The Cybernetic 
Theory of 
Development 


Mathematical Models for A Re-Evaluation 
of the Is-Ought Problem 


by 
Y. Ahmavaara 


UNIVERSITY OF TAMPERE, FINLAND 


KUSTANNUSOSAKEYHTIO TAMMI 
HELSINKI 


© by Yrjö Ahmavaara 1974 
ISBN 951-30-2002-9 
KK:n kirjapaino, Helsinki 1974 


Introduction to the Problem of 
’Is’ and ’Ought’ 


One can hardly deny that the absolute separation of ’is’ and ’ought’ 
by Hume and Kant for a good part was due to their willingness to 
conciliate between God and human reason. Wittgenstein in his own 
separation of ’ought’ from ’is’ was quite outspoken. He writes ”God 
does not appear in the world” (Tractatus, 6.432), meaning that ”Ethics 
is transcendental” (ibid., 6.421). To show this was Wittgenstein’s very 
aim: “The aim of my book is ethical, ... My book confines the ethical, 
as it were, from the inside and I am convinced that only so it is exactly 
confinable” (in a letter to Ludwig von Ficker). 

Wittgenstein’s ’Tractatus’ is the work which perhaps more than any 
other single accomplishment underlies the linguistic separation of ’is’ 
and ’ought’ in modern logic. It can be examined as an effort 

1° to build a language in sentences of which all meaningful thoughts 
concerning ’is’ could be voiced and 

2° to show that all ethics, i.e. thoughts concerning ’ought’, are so 
excluded. 

Following up program 1 Wittgenstein constructed the simple proto- 
type of what is now called an extensional language. Later, even languages 
directly or indirectly concerning ’ought’ have become objects of 
descriptive study, leading to constructions of formalised intensional 
languages in deontic logic and in modal logic in general. 

On which grounds lies the separation of ’is’ and ’ought’ in its modern, 
linguistic form? 

*


Let us approach the problem in terms of an illustrative model. Let 
X be the set of the thinkable states of the world. Then the set F(X) of 
all the subsets of X is the set of all the thinkable states of affairs. For 
instance, the state of affairs that there is a red flower in a certain vase 
is represented by a certain subset A of X. This subset comprises all the 
states of the world in which the vase in question has a red flower. In 
effect every local state of affairs, local either spatially or temporally, 
singles out a proper subset of X, for it does not distinguish between the 
properties of the states of the world outside of some local environment. 

To each thinkable state of affairs $A \subset X$ there is a meaningful 
sentence $p_A$ in our extensional language, expressing that this state of 
affairs is valid. The sentence $p_A$ is true if the real state of the 
world $x$ belongs to $A$, otherwise false. The logical negation, 
conjunction, and disjunction of sentences correspond to the formation of 
the complement, intersection, and union of the respective sets. In this 
way we can also build sentences syntactically. All the sentences 
expressing the state of affairs $X$ are logical (’analytical’) truths. 
Other true sentences express empirical (’synthetic’) truths or facts. 

So we have characterized a logic of ’is’ given in an extensional language 
(at an empirical level). 
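
The model can be tried out on a small finite example. The following is only a sketch under stated assumptions: the toy world of a flower and a window, and the names `X`, `C`, `powerset`, `is_true`, and `real_state`, are introduced here for illustration and are not part of the model itself. A finite X is chosen so that F(X) can be listed in full.

```python
from itertools import chain, combinations

# Hypothetical toy world: a state is a pair (flower in the vase, window).
X = frozenset(
    (flower, window)
    for flower in ("red", "white", "none")
    for window in ("open", "closed")
)


def powerset(states):
    """Return F(states): every subset of `states`, as frozensets."""
    items = list(states)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}


def is_true(state_of_affairs, real_state):
    """The sentence p_A is true exactly when the real state belongs to A."""
    return real_state in state_of_affairs


# The state of affairs "there is a red flower in the vase".
A = frozenset(x for x in X if x[0] == "red")
# The state of affairs "the window is closed" (a second, assumed example).
C = frozenset(x for x in X if x[1] == "closed")

real_state = ("red", "open")  # assumed real state of the world

negation = X - A      # "not p_A"      corresponds to the complement
conjunction = A & C   # "p_A and p_C"  corresponds to the intersection
disjunction = A | C   # "p_A or p_C"   corresponds to the union
```

With six states, F(X) has 2^6 = 64 members, and the sentence belonging to X itself is true whatever the real state is, which is the sense in which it expresses a logical truth.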

But every real person is capable of distinguishing only a finite number 
of states of affairs from one another. Then for each person there is a 
partition $\{B_i\}$ of $X$, such that this person is able to grasp only 
the states of affairs representable as unions of the subsets $B_i$. These 
states of affairs together make up his personal, local world. We can say 
that the person is ’conscious’ of these states of affairs only. 
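
A person's local world can be listed explicitly when X is finite. The sketch below assumes a six-state world and a three-block partition; both, like the function names, are hypothetical choices made only to illustrate the construction.

```python
from itertools import combinations

X = frozenset(range(6))  # six hypothetical states of the world

# A hypothetical partition of X into the blocks B_1, B_2, B_3.
partition = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]


def is_partition(blocks, states):
    """Blocks must be non-empty, pairwise disjoint, and cover `states`."""
    covered = frozenset().union(*blocks)
    disjoint = sum(len(b) for b in blocks) == len(covered)
    return all(blocks) and disjoint and covered == states


def conscious_states_of_affairs(blocks):
    """All unions of blocks: the states of affairs the person can grasp."""
    unions = set()
    for size in range(len(blocks) + 1):
        for chosen in combinations(blocks, size):
            # The union of no blocks is the empty (impossible) state of affairs.
            unions.add(frozenset().union(*chosen))
    return unions


local_world = conscious_states_of_affairs(partition)
```

Three blocks give 2^3 = 8 graspable states of affairs out of the 64 in F(X). The state of affairs {0}, for instance, is not among them, because this person cannot tell state 0 from state 1.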


Let us extend the model to grasp some aspects of ’ought’ too. Every 
act certainly can be represented as a function f from X to F(X). This 
says only that the state of the world $x$, in which the act is performed, 
together with the act $f$ itself singles out a definite state of affairs 
$A \in F(X)$ as the ’consequence’ $f(x)$ of the act $f$ under the premise 
$x$: $f(x) = A$. Which particular state of affairs $A$ is so picked 
depends on the definition of the act $f$ in question as an empirical 
construct. 

For instance, consider the act of closing a window. If we regard the 
act as completed only once the window has actually been closed, we have 
a constant act $f$ whose consequence $f(x)$ is the same state of affairs 
$C \in F(X)$ for any state $x \in X$. But if we consider as the act the 
movements undertaken in order to close the window, then the act may 
either succeed or fail. In this case we have two possible consequences, 
viz. the state of affairs $C$ and its complement $\bar{C}$ in $X$. 
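
The two readings of the window example can be written down as two functions from X to F(X). This is a sketch only: the world of window and latch, and the rule that the attempt fails just when the window is open and the latch is jammed, are assumptions added to make the example computable.

```python
# Hypothetical toy world: a state is a pair (window, latch).
X = frozenset(
    (window, latch)
    for window in ("open", "closed")
    for latch in ("works", "jammed")
)

# The state of affairs C: "the window is closed".
C = frozenset(x for x in X if x[0] == "closed")
C_bar = X - C  # its complement in X


def window_closed(x):
    """Act counted as completed only once the window is closed: constant."""
    return C


def try_to_close(x):
    """The movements of closing; assumed to fail only when the window is
    open and the latch is jammed."""
    if x[0] == "closed" or x[1] == "works":
        return C
    return C_bar


def consequences(act):
    """The set of possible consequences f(x) of an act f over all of X."""
    return {act(x) for x in X}
```

Here `consequences(window_closed)` contains the single state of affairs C, while `consequences(try_to_close)` contains both C and its complement.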




When speaking of acts we usually mean the conscious acts of some 
person or community. A conscious act is one whose possible consequences 
are states of affairs which are conscious to the actor in question. A 
conscious act thus is a function $f$ from $X$ to $F(X)$ such that its 
possible values $f(x)$ are some subsets $C_1, C_2, \ldots, C_n$ of $X$, 
finite in number, each representing a union of the respective partition 
sets $B_i$. The actor knows only the possible consequences 
$C_1, C_2, \ldots, C_n$ but not usually the state of the world in which he 
is acting. That is to say, he is usually acting under conditions of 
uncertainty. 

The notions of act and conscious act so defined include, as far as 
I can see, all the meanings in which one has spoken of acts. We have 
stated only that an act may bring into being some state(s) of affairs. 
This I think cannot be denied of any kind of act. 

The notion of value-judgement can in equally general terms be defined 
as setting up a certain preference order between the possible consequences 
of the possible acts an actor is capable of conceiving. Every subjective 
value-judgement of a real person is based on a certain preference order 
of some states of affairs conscious to him, i.e. of his $B_i$’s or their 
unions. If there is an objective ethical value, it must be represented by 
an order of preference of the states of the world $x \in X$. 


Admittedly all our observations are acts which bring into our conscious- 
ness a certain empirical fact, viz. that the true state of the world x belongs 
to a certain proper subset A of X or, what is the same, that a certain 
state of affairs A is valid — and nothing more. This is what Wittgenstein 
obviously wanted to say. (An observation thus is a function $f$ from $X$ 
to $F(X)$ such that $x \in f(x) = A$ for any $x$ and $A$.) 
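
The defining property of an observation, that the true state always belongs to the state of affairs reported, can be checked mechanically on a finite example. In the sketch below the six states, the blocks, and the function names are hypothetical; building the observation from a partition is one convenient assumption, not the only way to satisfy the condition.

```python
X = frozenset(range(6))  # six hypothetical states of the world

# Hypothetical partition: what the observer's instrument can tell apart.
blocks = [frozenset({0, 1}), frozenset({2, 3, 4}), frozenset({5})]


def observe(x):
    """Return the state of affairs A = f(x) established by the observation."""
    for block in blocks:
        if x in block:
            return block
    raise ValueError(f"{x!r} is not a thinkable state")


def is_observation(f, states):
    """Check the defining property: x belongs to f(x) for every state x."""
    return all(x in f(x) for x in states)


def misreport(x):
    """An act that is not an observation: it always reports the block {5}."""
    return frozenset({5})
```

`observe` passes the check, whereas `misreport` fails it as soon as the true state is anything other than 5.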

But is all ethics really so excluded from any confrontation with 
empirical knowledge, indeed outside the capacity of human reason 
(’transcendental’), as Wittgenstein thought? 

In practical life we of course set up many kinds of orders of preference 
between different states of affairs, say, to settle which instrument is 
best for each task, and all this is done on the basis of the observations 
we have on the relevant properties of the different instruments. However, 
a genuine ethical value is concerned only when it comes to the meaning 
of life or perhaps of the world in general. Wittgenstein stated his point 
as follows. 

”The meaning of the world must be outside of the world. In the world 
everything is as it is and everything happens as it happens... All that 
happens and is in this or that way is chance. What makes of value 
something else than chance cannot be of this world, it cannot be in the 
world, since otherwise it would itself be chance. It must be outside of 
the world.” (Tractatus, 6.41.) 

Wittgenstein seems here to voice the idea that a genuine ethical value 
is connected with some kind of necessity, a factual necessity and not 
merely a logical one. As he admits only logical necessity as existent 
in the world (sentence 6.375), he cannot find any genuine value in the 
world. 


But if you take as the ’world’ the human world, composed of mankind 
and its productive system (including the relevant part of nature), you 
definitely have better prospects of finding a factual necessity and also 
a genuine ethical value in the world. The natural laws of course restrict 
the factual possibilities of action of human beings. And so do all the 
previous, completed acts of people. Natural laws + previous human 
activity together determine the factual necessity in the framework of 
which every human actor, whether an individual or a collective, is bound 
to act. 

Translated to the language of our model: If X is interpreted as the 
set of the thinkable developmental states of the human system, then 
there is a proper subset D of X which expresses the factual necessity, 
and comprises the possible states of the system. Possible acts are those 
whose possible consequences are composed solely of the elements 
of D. 

Further on: If $X_t$ and $D_t$ denote the respective sets of states at 
the moment $t$ of a (for the sake of simplicity) discrete calendar, then 
$D_t$ is determined by the developmental law of the human system. This 
law expresses the natural laws + the laws due to previous human action. 
Such a law is representable by a function $\varphi$ from $X_t$ to 
$X_{t+1}$, under the assumptions made. 
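
A discrete developmental law of this kind can be illustrated with a toy iteration. The sketch rests on assumptions not stated in the text: the state set is the same at every moment, the particular law `phi` is invented, and the set of possible states at the next moment is taken to be the image of the present set under the law.

```python
X = frozenset(range(8))  # hypothetical thinkable states, the same at every t


def phi(x):
    """Hypothetical developmental law: move one step towards state 5."""
    if x < 5:
        return x + 1
    if x > 5:
        return x - 1
    return x


def possible_states(d0, steps):
    """Return [D_0, D_1, ..., D_steps], with D_{t+1} the image of D_t."""
    d = frozenset(d0)
    history = [d]
    for _ in range(steps):
        d = frozenset(phi(x) for x in d)
        history.append(d)
    return history


history = possible_states({0, 7}, steps=5)
```

Starting from the two possible states 0 and 7, the possible set shrinks step by step until only state 5 remains. The law is deterministic, yet the sequence it generates has a direction.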

Now we can pose the crucial question. Is there such a developmental 
law governing the history of mankind? A linguistic idealist is inclined 
to call such a law a ”metaphysical belief”. But for a cybernetician the 
question is quite meaningful. He would reply that the answer wholly 
depends on the structure of the system in question. If there are 
suitable couplings of feedback in the structure of the human system, 
then it is quite possible that a law $\varphi$ exists, and gives to the 
system a goal-directed, ’purposive’ nature. He would call $D_t$ the 
domain of ergodicity or the domain of self-steering of the system at the 
moment $t$. 

The law of development $\varphi$ expresses causality and thus is 
deterministic. But human beings as active parts of the system are 
themselves creating a share of this causality. So the development is not 
necessarily blind but is capable of expressing also human intentions. 

The theoretical limits within which the human pursuit may express 
itself are determined by the domain of ergodicity, and thus ultimately 
by the natural laws + earlier human history. The size of this domain 
thus is a measure of the capability of human beings to realize their 
humanity (or whatever they want to realize) by means of their power over 
the natural forces and over the laws of their own development. In the 
sense first defined, I think, by Engels, it thus is a measure of human 
progress and indeed an objective foundation of ethical value, as derived 
from his world. 

To sum up: in a cybernetic construction, taking into account the 
feedback structure of the human system, you may get a more intricate 
relation between ’is’ and ’ought’ than is customarily assumed in current 
linguistic constructions. This is the idea we are going to study in this 
book. 


How Marxist is this idea? The work of Oskar Lange and Marxist 
philosophy will be much quoted in the book, the latter especially in 
the form expounded by the GDR collective of philosophers in their book 
’Marxistische Philosophie’ (Dietz, Berlin 1967). But the present book 
only suggests mathematical models, and such models must be 
distinguished from philosophical formulations at the verbal level. So it 
is more exact to call my approach not a Marxist one but rather a 
cybernetic one. 

Because of the conspicuous similarity of some Marxist formulations and 
certain cybernetic ideas, Marxist terminology will often accompany the 
cybernetic constructions in this book. But here we of course meet the 
problem of the compatibility of verbal language and mathematics. To my 
mind a mathematical model is in science always something more than a 
verbal philosophical formulation. It at least aims at an unambiguity and 
exactness which is unattainable when using verbal language only. 
Therefore the cybernetic constructions in this book are an autonomous 
theory, which must be appraised on its own merits only. 




The aim of the book is thus a cybernetic one: to introduce a cybernetic 
approach to the problem of ’is’ and ’ought’. Whoever wants to grasp 
the idea quickly can read Chapters III and V. For a more complete 
picture include Chapter II and Chapter IV. Chapter I is there only to 
make a textbook: to give the necessary details for a beginner who is not 
familiar with elementary mathematics. Chapter I is quite trivial, but in 
my experience it is just this kind of knowledge that a social scientist 
wanting to read cybernetics lacks. 


Contents 


CHAPTER I Elementary Mathematical Tools 

1 § What is Mathematics? 
1 / A formal language? 
2 / The failure of the Formalist program 
3 / Non-linguistic aspects of mathematics 

2 § Elementary Mathematical Notions 
1 / Set 
2 / Function 
3 / Function space 
4 / Vector space 
5 / Mathematical relation 
6 / Invariance and transformation 
7 / Algebraic operation 

3 § Matrices 
1 / Matrix algebra 
2 / Linear transformation 
3 / The inverse of matrix 
4 / Eigen values 

4 § Real Functions 
1 / Derivation 
2 / Derivatives of elementary functions 
3 / Analytic function 
4 / Partial and total derivatives 
5 / Integral and integral function 
6 / Space integral 

5 § Topological Notions 
1 / Topological space 
2 / Topological mappings 
3 / Topological manifolds 

6 § Complex Numbers 
1 / What are the complex numbers for a science 
2 / The algebraic operations on complex numbers 
3 / The roots of algebraic equations 


CHAPTER II Fundamental Cybernetic Notions 

1 § A Glance at the History of Cybernetics 
1 / Descartes 
2 / Pavlov 
3 / Wiener 
4 / McCulloch and von Neumann 
5 / Oskar Lange 

2 § System Notions 
1 / General systems 
2 / Material systems 
3 / Systems of definite topology 
4 / Cybernetic systems 
5 / Explicit introduction of time in a cybernetic system 
6 / Digital systems 
7 / Analog systems 

3 § The Notion of Cybernetic Whole 
1 / Wholes and components 
2 / Organization 
3 / Structural organization: cybernetic coupling 
4 / Qualitative and quantitative aspects of system objects 
5 / Input organization: the notion of input information 
6 / Output organization: the cybernetic notion of action 
7 / Levels of organization 


CHAPTER III Cybernetic Theory of Self-Generating Processes 

1 § The Self-Generating Process 
1 / General 
2 / Dialectical contradiction within a cybernetic whole 
3 / The inner law of motion of a whole 

2 § Cybernetic Systematics of Self-Generating Processes 
1 / The study of internal contradictions ’in the small’: cybernetic 
categories of contradictions 
2 / The study of internal contradictions ’in the large’: purposiveness 
and ergodicity 

3 § On Future Development and Open Problems of Cybernetics 
1 / The need for a theory extending over the successive phases of 
self-generating dialectical process: the problem of complication 
2 / The need for spatial localization of cybernetic systems: cellular 
and tessellation models 
3 / The need for realistic probabilism: thermodynamic models and error 
theory 
4 / One further need in future cybernetic theory: a theory of sensitive 
systems 
5 / What will be preserved of present cybernetic theory? 


CHAPTER IV The Cybernetic Model of Rational Actor 

1 § The Turing Machine as a Cybernetic System 
1 / The human cognitive system 
2 / Description of the Turing machine 
3 / Computation in the Turing machine 

2 § The Turing Machine as an Idealized Model of Rational Actor 
1 / The Turing machine as a model of the optimal organization of a 
rational actor 
2 / The universal Turing machine as a model of an optimal rational 
actor 
3 / Does the recursivity of its operations make the Turing machine 
intellectually inferior to the human brain? 


CHAPTER V Cybernetic Logic of Social Development 

1 § The System-Character of the World 
1 / Universal causal determinism 
2 / Development in multi-ergodic systems 

2 § The Notions of Necessity, Possibility, Chance and Freedom 
1 / A tangential model of development 
2 / The semantics of logical modalities 

3 § The Materialistic Foundation of Modal Logic and Deontics 
1 / The logical truths and the dialectical truths 
2 / Strict implication and the modal operators 
3 / The materialistic and the idealistic approach in modal logic 
4 / Is many-valued modal logic needed? 
5 / Objective ethical value 
6 / The modal logic of social structures 


CHAPTER I: 


Elementary Mathematical Tools 


1 § What Is Mathematics? 


1 / A Formal Language? 


The most striking thing in mathematics undoubtedly is its playing with 
signs. Mathematics itself has sometimes been held to be a conventional 
game comparable to chess. 

To have a more realistic picture of mathematics we should begin by 
distinguishing between an empirical and a theoretical level of knowledge. 
Our sensory observations concern what could be called ’empirical 
objects’. The empirical objects are objects which — though they are 
themselves idealisations of a certain theory conceived in our mind — are 
directly related to the real objects, and represent the images of the 
latter in our consciousness. 

The theoretical objects dealt with at the theoretical level of knowledge 
are usually not directly related to the observations of real objects. The 
human mind ’invents’ them by its creative imagination, out of the 
material offered by the observations of reality. The theoretical 
constructs considered in mathematics are no exception to this general 
rule. In this sense we can say that the mathematical notions are based on 
the observation of reality. And their final purpose, like the purpose of 
any theoretical constructs, is to ’explain’, to systematize 
observations. 

So the playing with signs in mathematics is not an end in itself. 
Still the game-aspect too is there. It is inherent in the study of 
mathematics as a deductive system, as a formal language (a calculus). In 
terms of elementary mathematical notions (you find them explained later 
in this chapter) a formal language can be characterized as follows. 




It has a finite or infinite set $E$ of basic signs. Every finite or infinite 
sequence $zuv\ldots$ of the basic signs belongs to the set $M$ of signs 
considered in the language. There is a subset $A$ of $M$ comprehending the 
’expressions’ of the language. There is a function $f$ from the set $F(A)$ 
of all the subsets of $A$ to the same set $F(A)$, such that 


1° $X \subset f(X)$ for any $X \subset A$, 
2° $f(X_1) \subset f(X_2)$ for any $X_1 \subset X_2$, 
3° $f(f(X)) \subset f(X)$ for any $X \subset A$, 
4° for any $z \in f(X)$ and any $X \subset A$ there is a finite set 
$X^* \subset X$ such that $z \in f(X^*)$, and 
5° there is a set $S \subset A$ such that $f(S) \subset S$. 


Here $S$ is the set of the ’theorems’, or true sentences. The function 
$f$ is the operation of logical deduction as defined for this language. 
The study of the formal languages, i.e. combinations $(E, M, A, f)$, 
involved in mathematics means studying the logical syntax of 
mathematics. 
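
The conditions on the deduction operation can be checked on a very small example. What follows is a sketch under assumptions: the four expressions, the two rules of inference, and the choice of p as the single axiom are hypothetical, and the inclusion sign is read as ’subset or equal’. Condition 4° holds trivially here because every set involved is finite.

```python
from itertools import combinations

# Hypothetical toy language: four expressions and two rules of inference.
A = frozenset({"p", "q", "r", "s"})
RULES = [
    (frozenset({"p"}), "q"),       # from p infer q
    (frozenset({"p", "q"}), "r"),  # from p and q infer r
]


def f(X):
    """Deductive closure of X: apply the rules until nothing new appears."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= closed and conclusion not in closed:
                closed.add(conclusion)
                changed = True
    return frozenset(closed)


def subsets(s):
    """All subsets of s, i.e. the members of F(s)."""
    items = list(s)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    ]


ALL = subsets(A)

cond_1 = all(X <= f(X) for X in ALL)
cond_2 = all(f(X1) <= f(X2) for X1 in ALL for X2 in ALL if X1 <= X2)
cond_3 = all(f(f(X)) <= f(X) for X in ALL)

S = f(frozenset({"p"}))  # the 'theorems' derivable from the assumed axiom p
cond_5 = f(S) <= S
```

In this toy language the theorems are p, q and r. The expression s belongs to the language but is not derivable, and deducing from q alone yields nothing new.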

In the logical semantics of mathematical notions one studies the 
connections of formal languages with the theoretical objects which are 
spoken of in the language. A formalisation of the language used in a 
semantic analysis leads to the notion of ’meta-language’, i.e. a language 
used when studying the logical properties of the object language. This 
raises the question of the existence of a single great formal language 
in terms of which all mathematics could be expressed. Is that 
possible? 


2 / The Failure of the Formalist Program 


If mathematics is a formal language, it should be possible to ’play 
mathematics’ without taking any notice of the meanings associated 
with mathematical symbols. The mathematical truths would be derivable 
by purely formal rules of mathematical calculus, if only one could find 
out the rules of this game. What should such a calculus look like? 

One comes to think of a calculus where all the true statements can 
be formally derived from a number of axioms by a number of rules 
of inference. By means of the rules of inference a complete recursion 
of all the mathematical truths to the chosen axioms would be possible. 
Therefore, this hypothetical calculus was called the ”Complete Recursive 
Calculus” (CRC) by the Formalists who had the idea in mind. 

The idea of CRC was refuted by a well known theorem of Gödel 
(Kurt Gödel, Über formal unentscheidbare Sätze der Principia 
Mathematica und verwandter Systeme, Monatsh. Math. Phys. 38, 
p. 173-198, 1931). The theorem states that any adequate consistent 
arithmetical calculus is incomplete: there are always true statements 
about the integers which cannot be proved within the calculus, that is, 
which cannot be derived from the axioms of the calculus by the rules of 
inference of the calculus, whatever axioms and whatever rules of 
inference we might assume. This means that a class of fundamental 
mathematical calculi are IRC, Incomplete Recursive Calculi, rather than 
CRC. 

What is the philosophical content of Gödel’s proof? It gives evidence 
to the effect that mathematics cannot be comprehended as a closed, 
formal, and conventional system as was supposed by the Formalists. 
On the other hand, Gödel’s theorem is in good accord with dialectical 
materialism, according to which even the mathematical truths reflect 
material reality. The component of reflexion, the foundation of 
mathematical truths on the reflexion of material reality in the 
consciousness of the human being, can be eliminated from mathematics 
just as little as it can be eliminated from physics, from botany, or 
from zoology. Mathematics, just like physics, or botany, or zoology, is 
a science telling something about the phenomena of material reality. It 
dwells on different phenomena and on a higher level of generality than 
the other sciences but is still as deeply anchored in our observation of 
material reality as are the other sciences. The use of formal symbols 
helps in the communication of mathematical ideas but mathematics is not 
just a formalism. 

Though the early Formalist Program was wrecked by Gödel’s theorem, 
the conception of mathematics as a formal language is today almost 
as strong as it was in 1931, and before that, in the countries of 
Positivism. This is because this conception is a part of the 
positivistic interpretation of sciences in terms of linguistic idealism. 
”Logic and mathematics are based, according to the ideas of the 
neopositivists, on a system of quite arbitrarily assumed axioms and 
rules, which is a similar product of convention as are the rules of 
chess and a card game.” (The Foundations of Marxism-Leninism, 2. Finnish 
edition, p. 47-48, 1961). Indeed, the conception of mathematics as a 
formal language composed of axioms and rules of inference chosen by more 
or less aesthetic convention (”simplicity” etc.) lives on as a part of 
the positivistic philosophy of mathematics. Another part of this 
philosophy is the conception that all that is fundamental in mathematics 
can be attained by a logistic analysis of the logical syntax of the 
’language of mathematics’. This is wrong too: there are obvious 
non-linguistic aspects of mathematics. 


3 / Non-Linguistic Aspects of Mathematics 


Every mathematician knows by experience what a great role sensations, 
perceptions, visual images and geometrical illustrations play in 
mathematical thinking. All these are elements that connect the 
mathematician’s work with the observation of material reality and with 
the so-called ’empirical’ sciences. A successful mathematician is in his 
work often directly pulled by concrete problems of physics, chemistry, 
biology, or economics. The aspects of mathematics as a science studying 
problems of material reality are badly neglected by the purely 
linguistic conception of mathematics. 

A more realistic philosophical interpretation of mathematics must 
start with the fact that mathematics, like physics, botany, or zoology, 
is a science devoted to the study of certain aspects of material 
reality. Physical particles, plants, or animals are not themselves 
studied by mathematics but certain sets of particles, animals, and 
plants, and certain proportions holding between the elements of such 
sets may well be. Similar sets and similar proportions may appear among 
other things of reality. Mathematics is a science studying such sets and 
proportions wherever they may appear in reality, or in an imaginable 
reality. 

A student of mathematics is soon fascinated by the generality of 
the aspects of reality revealed to mathematical observation. The sets 
and the proportions studied in mathematics approach universal validity, 
and appear in the most different contexts when compared with the more 
restricted aspects of reality studied by physics, botany, or zoology. 
Still one can say that the starting point of mathematical study is in 
the observation of certain specific proportions holding between the 
elements of certain specific sets. Mathematical discovery is based rather 
on the study of the specific and concrete than of the general and abstract. 
The seeming paradox we meet here is solved as follows: generality is not 
primarily sought in mathematics; it comes from the fact that 
a closer examination of just the exact, specific proportions holding 
between the elements of a certain set often reveals specific 
proportions similar to those that appear in some other sets of quite a 
different nature. It is a typical characteristic of the mathematical 
method, I dare say, that it is based on a close examination of the 
specific prior to the general. Here again is a non-linguistic feature of 
mathematics which is missed by the positivist philosopher who interprets 
the whole of mathematics in terms of the general syntactical categories 
of a language. 

Above I have referred to some of the non-linguistic aspects of mathe- 
matics which the positivistic linguistic interpretation of mathematics 
fails to see. On the other hand, they are well in accord with the reflexion 
theory of dialectical materialism. To make it more precise and sum up 
the materialistic point of view we can say: 

Logical deduction based on some axioms and on some rules of inference 
is an aspect of mathematical reasoning. However, in addition to the 
linguistic, grammatical element associated with the language of deduction 
there are non-linguistic, non-grammatical aspects of mathematics. Mathe- 
matical truth is not based on the finding of a ’correct’ logical syntax 
for the language composed of the axioms and of the rules of inference 
but rather on the correct observation of certain specific proportions holding 
between elements of reality. Thus the foundation of mathematics is not 
to be sought solely in its linguistic grammar but in the whole reflexion 
of reality in human consciousness, in the observation of reality. This 
observation never ceases to bring forth new features of reality, and thus 
new features of mathematics. 

Of course, this does not mean diminishing the value of the linguistic, 
logistic study of mathematical rules; it only rejects a philosophical 
interpretation of mathematics based solely on its linguistic aspects. 


2 § Elementary Mathematical Notions 


1 / Set 


We spoke above of the observation of specific, exact proportions 
holding between the elements of some sets. These sets and these 
proportions, with which mathematics is concerned, are something existing 
in reality, or represent something which may exist in reality. What 
are the ”elements” and what are the ”sets”? 




If we consider the material components of which, say, a motor car 
is composed, these components together give us an example of what 
is meant by a set. However, these components form the same set irre- 
spective of whether we consider them as members in the whole called 
’car’ or whether we consider them as separated from one another, 
i.e. if we take the car to pieces and consider the collection of these 
separated pieces. Thus the motor car is something more than just a set 
of its elements: it is a set where the elements stand in particular pro- 
portions to one another. A material thing is a whole where both a 
set and some proportions holding between the elements of this set are 
involved. Distinguishing, when observing a material thing, between 
the set of its components and the proportions of these components 
(elements) to one another is the beginning of mathematics. The fact 
that a component $x$ is an element of the set $A$ of all the components 
is denoted as an ”epsilon relation” as follows: 


$$x \in A.$$


A property of the ordinary material components of which a 
material thing is composed is that they can be ordered to form a 
sequence. We can, for instance, order the components of a motor car 
along a straight line where each of the components lies between two 
neighboring ones. We can then associate a number to indicate the order 
of the components in the sequence, and write the sequence as follows: 
$x_1, x_2, \ldots, x_n$, where each of the $x_k$ represents one component. If $A$
is the set composed of these components we now denote the set by
writing

$A = \{x_1, x_2, \ldots, x_n\}$.


Another way of denoting the set $A$ is to introduce its "general element"
and write


$A = \{x;\ x = x_1, x_2, \ldots, x_n\}$.


In this denotation the $x_1, x_2, \ldots, x_n$ are the "values" allowed for the
general element in the set $A$. For each element we can write the epsilon
relation separately: $x_k \in A$ for each $k = 1, 2, \ldots, n$.

Obviously the conception of natural numbers, by which we mean
positive integers, is closely connected with the conception of the set
of the material components of a material thing. Once a human being
has been able to distinguish between such a set and the proportions
in which the elements of this set stand to one another he has begun
to develop the conception of natural numbers. Thus the beginning
of mathematics has been simultaneously the beginning of counting
things.

If we first take a material thing to the pieces $a_1, a_2, \ldots, a_n$, and then
each of these components to smaller pieces, and continue in this way,
it may happen that the reduction can be continued without any limit.
We do not know if this is really possible; even the modern theory
of elementary particles gives no definite answer to this question.
However, to the possibility of such an unlimited reduction of a material
thing there corresponds the idea of continuing the sequence of natural
numbers without limit: 1, 2, 3, ... The number of the elements of the
set $A = \{1, 2, 3, \ldots\}$ is greater than any fixed number $n$. We call such
a number infinity and denote it by the symbol $\infty$. The set of all natural
numbers can now be denoted by $A = \{1, 2, \ldots, \infty\}$ or, using the general
element, by $A = \{x;\ x = 1, 2, \ldots, \infty\}$.

With the acceptance of sets having an infinite number of elements 
some strange phenomena appear. Take, for instance, the set 


$B = \{\tfrac{m}{n};\ m = 0, \pm 1, \pm 2, \ldots, \pm\infty;\ n = 1, 2, \ldots, \infty\}$


of all rational numbers. At first glance it seems that this set has more 
elements than the set of natural numbers, since all the natural numbers 
are included but there are, in addition to this, all the non-integer rational 
numbers, and zero. However, we can order the rational numbers 
beginning with: 


Then we can give each rational number in this sequence an order number 
writing this sequence again as 


$x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, x_{10}, x_{11}, \ldots$


Thus we observe that the set $B$ of all rational numbers and the set $A$
of all natural numbers must be understood to have the same "number
of elements" in the sense that the elements of both of these sets can be
ordered into an infinite sequence. We express this by saying that
the sets $A$ and $B$ have the same cardinal number $\aleph_0$ (the letter reads
"alef"). By the cardinal number of a finite set we understand the number
$n$ of its elements. We write $\aleph_0 > n$.
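
The ordering of the rational numbers into a single sequence can be sketched in code. The sketch below is only one possible ordering, chosen here for illustration (it is not necessarily the ordering used in the text): after zero, the reduced fractions are listed by increasing sum of numerator and denominator, each followed by its negative. The function name `enumerate_rationals` is an assumption of this sketch.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def enumerate_rationals():
    """Yield every rational number exactly once, in one fixed order."""
    yield Fraction(0)
    total = 2  # numerator + denominator of the positive fractions in this pass
    while True:
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:  # skip unreduced repeats such as 2/4
                yield Fraction(p, q)
                yield Fraction(-p, q)
        total += 1


# Give the first members of the sequence their order numbers x_1, x_2, ...
first = list(islice(enumerate_rationals(), 11))
numbered = {k: x for k, x in enumerate(first, start=1)}
```

Every rational number receives exactly one order number in this way, which is what having the cardinal number of the natural numbers means.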




To express the fact that all natural numbers are among the rational
numbers we can use the "inclusion relation" by writing


$A \subset B$.


$A$ is called a subset of $B$. If $A$ and $B$ are any two sets of any
elements whatever, then $A \subset B$ if, and only if, all the elements of $A$ are also
elements of $B$. In other words, we have an inclusion relation $A \subset B$
if, and only if, from an epsilon relation $x \in A$ follows another epsilon
relation $x \in B$ for every element $x$ of $A$. Using the symbol of "implication"
$\Rightarrow$ for the phrase "it follows" we can say that $A \subset B$ means the same
as $x \in A \Rightarrow x \in B$.

A subset A of a set B which is different from B is called a proper 
subset of B. We have observed that even a proper subset A of B may 
have the same number of elements as the main set B, if B is an infinite 
set. For a finite set this is obviously not possible. 

The set composed of all the subsets of a set $B$ is denoted by $F(B)$.
The set $B$ itself is counted as an element of $F(B)$, and so is the "empty
set" $\emptyset$ that contains no elements; $B \in F(B)$ and $\emptyset \in F(B)$. From each
element $x$ of the set $B$ we get the subset $\{x\}$ containing only the element
$x$; $\{x\} \in F(B)$ or, what is the same, $\{x\} \subset B$.

For each element $x \in B$ and each subset $A \in F(B)$ we have either $x \in A$ or
$x \notin A$ ($x$ does not belong to $A$). Thus we have two possible relations for
each element of $B$ and each subset of $B$. A subset of a finite set $B$ is fixed
by choosing one of these two relations for every element of $B$, so that
we get all in all $2^n$ possible combinations of elements
of $B$ into subsets of $B$, provided that $n$ is the finite number of elements
of $B$. Thus the number of subsets of $B$ is $2^n$. Indicating the number
of elements of a set by the sign $\#$ we thus have, for any finite $n$,


$\#B = n, \quad \#F(B) = 2^n$ (a finite $n$).
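
The counting rule for a finite set can be checked directly. The following is a minimal sketch; the helper name `power_set` and the three-element example set are assumptions made for illustration.

```python
from itertools import combinations


def power_set(b):
    """Return F(B): all subsets of the finite set b, each as a frozenset."""
    items = list(b)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


B = {"wheel", "axle", "engine"}
F_B = power_set(B)
count = len(F_B)  # 2**3 subsets for a three-element set
```

Both the empty set and $B$ itself appear among the elements of `F_B`, in agreement with the text.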


This rule can be extended to infinite sets. If $m$ is the cardinal number
of the infinite set $B$, then the cardinal number of $F(B)$ is $2^m$, there being
always $2^m > m$;


$\#B = m, \quad \#F(B) = 2^m > m$ (an infinite $m$).


Accordingly, there is an infinite sequence of distinct infinite cardinal
numbers, since we can begin with $\aleph_0$, and generate the infinite sequence


$\aleph_0$, $2^{\aleph_0} = \aleph > \aleph_0$, $2^{\aleph} = \aleph_1 > \aleph$, $2^{\aleph_1} = \aleph_2 > \aleph_1$, etc.

An example of an infinite set of the cardinality $\aleph$ is the set $R$ of all
real numbers. We get all the real numbers by taking first all the rational
numbers of the form $r = \frac{m}{n}$ ($m$ and $n$ integers), and then adding the limits
$r = \lim \frac{m}{n}$ where $m$ and $n$ are integers approaching infinity. The latter
kind of real numbers are called irrational numbers. One cannot order
all the real numbers into a single sequence. Thus the cardinality of $R$
exceeds the cardinality $\aleph_0$ of the natural numbers and of the rational
numbers. One can show that this cardinality is equal to $2^{\aleph_0}$. It is
denoted by $\aleph$, as was indicated above, and called the cardinality of the
'continuum'.

In a closer examination of infinite sets we meet surprising curiosities
and paradoxes, which have become a common object of study of higher
set theory and mathematical logic. Here we do not go into these
problems but finish our introduction to the notion of set by introducing
some notations useful when operating with sets.

If $A$ and $B$ are two arbitrary sets we can form their union $A \cup B$,
their intersection $A \cap B$, and their cartesian (or set-theoretical) product
$A \times B$. These new sets are defined by


$A \cup B = \{x;\ x \in A \text{ or/and } x \in B\}$,
$A \cap B = \{x;\ x \in A \text{ and } x \in B\}$,
$A \times B = \{(x, y);\ x \in A \text{ and } y \in B\}$.


The union $A \cup B$ thus is the set of all the elements which belong either
to $A$ or to $B$ or to both of them. The intersection $A \cap B$ is the set of the
common elements of $A$ and $B$. The product $A \times B$ is the set of all the
pairs $(x, y)$ where $x$ belongs to $A$ and $y$ to $B$.
All these operations can be generalized to any number of sets: 


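$A_1 \cup A_2 \cup \ldots \cup A_n = \{x;\ x \in A_1 \text{ or } x \in A_2 \text{ or } \ldots \text{ or } x \in A_n\}$,
$A_1 \cap A_2 \cap \ldots \cap A_n = \{x;\ x \in A_1 \text{ and } x \in A_2 \text{ and } \ldots \text{ and } x \in A_n\}$,


$A_1 \times A_2 \times \ldots \times A_n = \{(x_1, x_2, \ldots, x_n);\ x_1 \in A_1, x_2 \in A_2, \ldots, x_n \in A_n\}$.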
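
For finite sets these three operations can be tried out directly. The sketch below uses Python's built-in set type; the example sets and the helper names `union_all` and `intersection_all` are assumptions made for illustration.

```python
from itertools import product

A = {1, 2, 3}
B = {3, 4}

union = A | B                   # elements belonging to A or to B or to both
intersection = A & B            # common elements of A and B
cartesian = set(product(A, B))  # all pairs (x, y) with x in A and y in B


def union_all(*sets):
    """Union of any number of sets."""
    return set().union(*sets)


def intersection_all(first, *rest):
    """Intersection of one or more sets."""
    return set(first).intersection(*rest)
```

The product of an $n$-element set and an $m$-element set has $nm$ pairs, here $3 \cdot 2 = 6$.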


2 / Function 


Another important notion of mathematics is connected with the specific
proportions in which the elements of sets may stand to one another.
This is the notion of function.




If we associate with each element x of a set A one and only one ele- 
ment y of another set B we have defined a function f from the set A to 
the set B. This is denoted by writing 


$f: A \to B$.


A function $f$ thus assigns to every element $x$ of $A$ an element $y$
of $B$ in such a way that the element $y$ is uniquely determined by the
element $x$. The element $y$ is called the "value" of the function $f$
corresponding to the "value" of the argument $x$. The connection between
the value of the function and the value of the argument is expressed
by writing

$y = f(x)$, or $x \overset{f}{\longrightarrow} y$.


The set $A$ of all the arguments of $f$ is called the domain of definition
of the function $f$, and is denoted by $D_f$. Thus $D_f = A$. The set of all
the values of the function $f$ is called the range of the function $f$, and
is denoted by $R_f$. Thus $R_f = \{f(x);\ x \in A\}$. The range is obviously a
subset of $B$: $R_f \subset B$. It may happen that $R_f$ is a proper subset: $R_f \neq B$.
This means that not every element of $B$ appears as a value of the function
$f$. In this case we can say that $f$ is a function from $A$ into the set $B$. If
$R_f = B$, we can say that $f$ is a function from $A$ onto the set $B$.

There may be several elements $x_1, x_2, \ldots$ of $A$ which all have the same
element $y$ of $B$ as the value of the function $f$: $f(x_1) = f(x_2) = \ldots = y$.
Thus we observe that to an element $y$ of $R_f$ there may correspond,
in the association effected by the function $f$, several elements of
$A$ so that the functional relation between $x$-elements and $y$-elements
cannot be inverted.

However, if the function $f$ is such that to every two different elements
$x_1$ and $x_2$ of $A$ there correspond two different elements $y_1 = f(x_1)$ and
$y_2 = f(x_2)$ of $R_f$, then the function can be inverted. We can then write


$x = f^{-1}(y)$,


where $f^{-1}$ is a function from $R_f$ onto the set $A$.

In the latter case the function $f$ (and likewise its inverse $f^{-1}$) introduces
a one-to-one correspondence between the elements $x$ of $A$ and $y$ of $R_f$.
This can be indicated by writing


$x \leftrightarrow y$ is 1-1.


Obviously, a function $f$ may have an inverse $f^{-1}$ only if the domain
$D_f$ and the range $R_f$ have the same cardinality.
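
For a finite domain a function can be written down as a table of argument-value pairs, and the notions of range, "onto" and inverse can then be checked mechanically. The sketch below represents such tables as Python dictionaries; the example functions and the helper names are assumptions made for illustration.

```python
A = {"a", "b", "c"}
B = {1, 2, 3, 4}

f = {"a": 1, "b": 2, "c": 2}  # two arguments share the value 2
g = {"a": 1, "b": 2, "c": 4}  # different arguments have different values


def value_range(func):
    """The range R_f: the set of all values of the function."""
    return set(func.values())


def is_onto(func, codomain):
    """True if every element of the codomain appears as a value."""
    return value_range(func) == set(codomain)


def is_invertible(func):
    """True if different arguments always have different values."""
    return len(value_range(func)) == len(func)


def inverse(func):
    """Return the inverse function, defined on the range of func."""
    if not is_invertible(func):
        raise ValueError("function cannot be inverted")
    return {y: x for x, y in func.items()}
```

Here `g` is a function from `A` into `B` (the value 3 is never taken), yet it has an inverse defined on its range.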




If f is a function from A onto B, and g a function from B onto C, 
we can form a composite function gf from A onto C. It is defined by 


$(gf)(x) = g(f(x))$.




Fig. 1. A function having no inverse 


Fig. 2. A function having an inverse 


Fig. 3. A composite function 


If $C = A$ and $g = f^{-1}$ the composite $f^{-1}f$ is a function from $A$ onto
the set $A$ itself. For the value of this function corresponding to the
argument $x$ we get: $(f^{-1}f)(x) = f^{-1}(f(x)) = f^{-1}(y) = x$. We can also form
another composite function $ff^{-1}$ for which we get $(ff^{-1})(y) = f(f^{-1}(y)) =
f(x) = y$. These results are denoted in short by writing


$f^{-1}f = 1, \quad ff^{-1} = 1$,


where 1 is used as a symbol for the identity function. Such a function
is defined by


$1(x) = x$ for all $x \in D_1$,


$D_1$ being the domain of definition of the function 1.
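
The rule for composing functions, and the fact that a function composed with its inverse gives the identity function, can be illustrated for finite sets. In this sketch functions are again dictionaries, and `compose` is a hypothetical helper name.

```python
def compose(g, f):
    """Return the composite gf, defined by (gf)(x) = g(f(x))."""
    return {x: g[f[x]] for x in f}


f = {"a": 1, "b": 2, "c": 3}             # one-to-one from A onto B
f_inv = {y: x for x, y in f.items()}     # its inverse, from B onto A

identity_on_A = {x: x for x in f}
identity_on_B = {y: y for y in f_inv}

inverse_after_f = compose(f_inv, f)      # a function from A onto A
f_after_inverse = compose(f, f_inv)      # a function from B onto B
```

Note that the two identity functions have different domains of definition: the first is defined on `A`, the second on `B`.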


3 / Function Space 


Just as when dealing with a given set $A$ we are usually interested
also in the set $F(A)$ of all subsets of $A$, when dealing with functions
from a set $A$ to a set $B$ we are often interested also in the set $F(A,B)$
of all functions from $A$ to $B$. This is in particular the case when $B$ is
the set of all real numbers. We have then the set $F(A,R)$ of all functions
from a given arbitrary set $A$ to the set $R$ of all real numbers:


(1) $F(A,R) = \{f;\ f: A \to R\}$.


If $f_1$ and $f_2$ are two functions from $A$ to $R$, and if $f_1(x)$ and $f_2(x)$ are
the two real numbers which are the values of these two functions for
one and the same argument $x$, we can add the two real numbers together,
and define a function $f_1 + f_2$ by


$(f_1 + f_2)(x) = f_1(x) + f_2(x)$ for all $x \in A$.


The function $f_1 + f_2$ so defined is also a function from $A$ to $R$.

In a similar way, using the fact that real numbers can also be mul- 
tiplied by one another and not only added to one another, we define 
for every function f from A to R and an arbitrary but fixed real number 
k a function kf by 


$(kf)(x) = k f(x)$ for all $x \in A$.


The function kf is then also a function from A to R. 

Since the functions $f_1 + f_2$ and $kf$ are functions from $A$ to $R$, they
belong to the set $F(A,R)$ of all functions from $A$ to $R$. Thus we observe
that by using the above rules we can add together two arbitrary
functions belonging to $F(A,R)$, and get a function which belongs to this
same set $F(A,R)$. And we can multiply any arbitrary function belonging
to $F(A,R)$ by an arbitrary real number $k$, and get a function which
belongs to $F(A,R)$. This is expressed by saying that the set $F(A,R)$ of
all functions from $A$ to $R$ forms a function space.
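
The two rules can be sketched for a small finite set $A$. Real-valued functions on $A$ are represented here as dictionaries from the elements of $A$ to floating-point numbers; the names `add` and `scale` and the example values are assumptions made for illustration.

```python
def add(f1, f2):
    """Pointwise sum: (f1 + f2)(x) = f1(x) + f2(x) for all x in A."""
    return {x: f1[x] + f2[x] for x in f1}


def scale(k, f):
    """Multiplication by a real number: (kf)(x) = k * f(x) for all x in A."""
    return {x: k * f[x] for x in f}


f1 = {"p": 1.0, "q": 2.0, "r": 0.0}
f2 = {"p": 0.5, "q": -2.0, "r": 3.0}

total = add(f1, f2)      # again a function from A to R
doubled = scale(2.0, f1)  # again a function from A to R
```

Both results are defined on the same set `{"p", "q", "r"}` and take real values, so they belong to the same function space as `f1` and `f2`.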

If $f_1, f_2, \ldots, f_n$ are $n$ functions from the function space $F(A,R)$, the
function $f$ defined by


$f = a_1 f_1 + a_2 f_2 + \ldots + a_n f_n$,


where $a_1, a_2, \ldots, a_n$ are real numbers, is the linear combination of $f_1, f_2, \ldots, f_n$
with the weights $a_1, a_2, \ldots, a_n$. Of course such a linear combination is
to be understood as the function whose values are given by¹


$f(x) = a_1 f_1(x) + a_2 f_2(x) + \ldots + a_n f_n(x)$ for all $x \in A$.


All the linear combinations of $n$ given functions $f_1, f_2, \ldots, f_n$ form
an n-dimensional linear space $L_n$,


(2) $L_n = \{f = a_1 f_1 + a_2 f_2 + \ldots + a_n f_n;\ a_i \in R\}$,


provided that no one of the functions $f_1, f_2, \ldots, f_n$ is a linear combination
of the other ones. The functions $f_1, f_2, \ldots, f_n$ are then said to be linearly
independent of one another, and form a basis system in the subspace $L_n$.

A function $\varphi$ from a function space $F(A,R)$ to the set $R$ of real numbers
is called a functional defined on $F(A,R)$. It is a linear functional if


$\varphi(af + bg) = a\varphi(f) + b\varphi(g)$


for any two functions $f$ and $g$ from $F(A,R)$, and any given two real
numbers $a$ and $b$.
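
A simple instance of a linear functional is the evaluation of a function at one fixed point of $A$. This example is not taken from the text; it is an illustration chosen here, and the names `linear_combination` and `evaluate_at` are assumptions of the sketch.

```python
def linear_combination(weights, functions):
    """Return the function a_1 f_1 + ... + a_n f_n, computed pointwise."""
    points = functions[0].keys()
    return {x: sum(a * f[x] for a, f in zip(weights, functions)) for x in points}


def evaluate_at(point):
    """Return the functional phi defined by phi(f) = f(point)."""
    return lambda f: f[point]


f = {"p": 1.0, "q": 2.0}
g = {"p": 4.0, "q": -1.0}
phi = evaluate_at("q")
a, b = 3.0, 0.5

lhs = phi(linear_combination([a, b], [f, g]))  # phi(af + bg)
rhs = a * phi(f) + b * phi(g)                  # a phi(f) + b phi(g)
```

The two sides agree, as the linearity condition requires for this functional.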


Fig. 4. A functional $\varphi$ on the function space $F(A,R)$


1. For an infinite $n$ there are other possibilities too, for the association of $f$ with
$a_1 f_1 + \ldots + a_n f_n$, leading to different topologies in function space. We shall pass
by the topological problems of an infinite-dimensional function space here.




A function $\varphi$ from the cartesian product space $F(A,R) \times F(A,R)$
to $R$ is a bilinear functional on $F(A,R)$, if it is linear in both of its
arguments:

$\varphi(af + bg, h) = a\varphi(f,h) + b\varphi(g,h)$, and
$\varphi(f, ag + bh) = a\varphi(f,g) + b\varphi(f,h)$.


Here $f$, $g$, and $h$ are any three functions from $F(A,R)$, and $a$ and $b$
are two arbitrary real numbers. A bilinear functional which is symmetric,
that is,


$\varphi(f,g) = \varphi(g,f)$

for all $f, g \in F(A,R)$, defines an inner product in the function space. The
inner product of two functions $f$ and $g$ is denoted by $\langle f,g \rangle$.
Accordingly, we can write

$\varphi(f,g) = \langle f,g \rangle$,

if $\varphi$ is a symmetric bilinear functional.

An inner product defined on $F(A,R)$ is positive-definite, if

$\langle f,f \rangle > 0$ for all $f \in F(A,R)$ except that $\langle f,f \rangle = 0$ for $f = 0$,

negative-definite, if


$\langle f,f \rangle < 0$ for all $f \in F(A,R)$ except that $\langle f,f \rangle = 0$ for $f = 0$,


and indefinite otherwise. A positive-definite, a negative-definite, or an
indefinite inner product is said to define in the function space $F(A,R)$
a positive-definite, a negative-definite, or an indefinite metric,
respectively.

In a positive-definite metric we call the square root of $\langle f,f \rangle$ the
norm of the function $f$, and denote it by


$\|f\| = \sqrt{\langle f,f \rangle}$.


The norm of the difference $f - g$ is called the distance between the
functions $f$ and $g$.


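The norm so defined is always positive or zero. Writing the square of the
norm of $af + g$, where $a$ is a real number and $f$ and $g$ are two functions
from $F(A,R)$, we thus always have:


$\|af + g\|^2 = \langle af+g, af+g \rangle = a^2 \langle f,f \rangle + \langle g,g \rangle + 2a \langle f,g \rangle \geq 0$.


A second-degree polynomial in $a$ which is never negative cannot have a
positive discriminant, so that $4\langle f,g \rangle^2 - 4\langle f,f \rangle \langle g,g \rangle \leq 0$.
The discriminant of the polynomial thus gives at once the inequality of
Schwarz,


(3) $|\langle f,g \rangle| \leq \|f\| \cdot \|g\|$ for any $f, g \in F(A,R)$,

which is thus valid in a positive-definite metric.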
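
The inequality can be checked numerically for a concrete inner product. The sketch below assumes a finite set $A$ and takes the sum of the products $f(x)g(x)$ over $A$ as the inner product; this particular choice is an assumption made for illustration, being one example of a symmetric, positive-definite bilinear functional.

```python
from math import sqrt


def inner(f, g):
    """Assumed inner product: the sum of f(x) * g(x) over all x in A."""
    return sum(f[x] * g[x] for x in f)


def norm(f):
    """Norm of f: the square root of <f, f>."""
    return sqrt(inner(f, f))


f = {"p": 3.0, "q": 4.0}
g = {"p": 1.0, "q": -2.0}

# Discriminant of the polynomial a^2 <f,f> + 2a <f,g> + <g,g> in a.
discriminant = 4 * inner(f, g) ** 2 - 4 * inner(f, f) * inner(g, g)
schwarz_holds = abs(inner(f, g)) <= norm(f) * norm(g)
```

For these two functions the discriminant is negative, so the polynomial has no real zero and the inequality is strict.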




4 / Vector Space 


All that was said above on the function space $F(A,R)$ has a simple
geometrical interpretation, if the basic set $A$ is finite. If $A$ has a finite number,
$n$, of elements, we can write: $A = \{x_1, x_2, \ldots, x_n\}$. The function space
$F(A,R)$ is in this case called an n-dimensional vector space $V_n$:

$F(A,R) = V_n$.


Obviously, every function $f$ from $A$ to $R$ is now represented by the
combination of the $n$ values $f(x_1), f(x_2), \ldots, f(x_n)$. Thus we can write:


$f = (f(x_1), f(x_2), \ldots, f(x_n))$.


Such a function is called an n-component vector which has the
components $f(x_1), f(x_2), \ldots, f(x_n)$. The elements $f$ of the vector space $V_n$
thus are n-component vectors.

We can easily show that the vector space $V_n$ is an n-dimensional
linear space. Indeed, a basis system in $V_n$ is given by the $n$ vectors
defined by


$f_1 = (1, 0, 0, \ldots, 0)$,
$f_2 = (0, 1, 0, \ldots, 0)$,
$\ldots$
$f_n = (0, 0, 0, \ldots, 1)$.

Any vector $f \in V_n$ can be expressed, obviously, as a linear combination


$f = a_1 f_1 + a_2 f_2 + \ldots + a_n f_n$,


where the weights are given by


$a_1 = f(x_1), a_2 = f(x_2), \ldots, a_n = f(x_n)$.


Secondly, the vectors $f_1, f_2, \ldots, f_n$ are linearly independent of one
another. Thus $V_n$ is an n-dimensional linear space. From the last equations
we see, furthermore, that an n-component vector $f \in V_n$ is represented
by the combination of the weights:


$f = (a_1, a_2, \ldots, a_n)$.


The addition of functions becomes in a vector space $V_n$ an addition
of vectors. If $f = (a_1, a_2, \ldots, a_n)$ and $g = (b_1, b_2, \ldots, b_n)$ are two vectors
of $V_n$, their sum vector is given by


$f + g = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$.


The scalar multiplication of a function becomes in a vector space $V_n$
a scalar multiplication of a vector:


$kf = (ka_1, ka_2, \ldots, ka_n)$


defines the vector $kf$ for any $k \in R$ and $f \in V_n$.

The weights $a_1, a_2, \ldots, a_n$ of the vectors $f \in V_n$ in the basis system
$f_1, f_2, \ldots, f_n$ are all linear functionals on $V_n$. This is seen by writing


$f = a_1(f) f_1 + a_2(f) f_2 + \ldots + a_n(f) f_n$


and using the rules of vector addition and scalar multiplication which
give:

$a_i(f+g) = a_i(f) + a_i(g)$ for any $f, g \in V_n$, and

$a_i(kf) = k a_i(f)$ for any $f \in V_n$; $i = 1, 2, \ldots, n$.


We can define a bilinear functional $\varphi$ on $V_n$ by writing

$\varphi(f,g) = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n$,

where $f = a_1 f_1 + a_2 f_2 + \ldots + a_n f_n$ and $g = b_1 f_1 + b_2 f_2 + \ldots + b_n f_n$.
This is symmetric, since


$\varphi(f,g) = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n = b_1 a_1 + b_2 a_2 + \ldots + b_n a_n = \varphi(g,f)$.


It is also positive-definite, since


$\varphi(f,f) = a_1^2 + a_2^2 + \ldots + a_n^2 > 0$


except for the case $a_1 = a_2 = \ldots = a_n = 0$.


Accordingly, this bilinear functional defines a positive-definite inner
product in the vector space $V_n$:


$\langle f,g \rangle = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n$.

In this positive-definite metric the norm of a vector $f$ becomes

$\|f\| = \sqrt{\langle f,f \rangle} = \sqrt{a_1^2 + a_2^2 + \ldots + a_n^2}$,

and the distance between two vectors $f$ and $g$ becomes

$\|f - g\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \ldots + (a_n - b_n)^2}$.


This positive-definite metric in the vector space $V_n$ is called the
Euclidean metric. The vector space $V_n$ where such a metric is defined
is called an n-dimensional Euclidean space $E_n$.
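
The vector operations and the Euclidean metric can be sketched with plain tuples of numbers. All function names below are assumptions of the sketch; the example rebuilds a vector from its weights in the basis system and then computes a norm and a distance.

```python
from math import sqrt


def basis(n):
    """The n basis vectors (1,0,...,0), ..., (0,...,0,1)."""
    return [tuple(1.0 if i == k else 0.0 for i in range(n)) for k in range(n)]


def add(f, g):
    return tuple(a + b for a, b in zip(f, g))


def scale(k, f):
    return tuple(k * a for a in f)


def inner(f, g):
    """Euclidean inner product a_1 b_1 + ... + a_n b_n."""
    return sum(a * b for a, b in zip(f, g))


def norm(f):
    return sqrt(inner(f, f))


def distance(f, g):
    return norm(add(f, scale(-1.0, g)))


f = (1.0, 2.0, 2.0)
g = (1.0, -2.0, 5.0)

# Rebuild f as the linear combination of the basis vectors with weights a_i.
rebuilt = (0.0, 0.0, 0.0)
for weight, e in zip(f, basis(3)):
    rebuilt = add(rebuilt, scale(weight, e))
```

The components of a vector thus serve directly as its weights in this basis system.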




An n-dimensional Euclidean space is simply a generalization of the
3-dimensional space geometry studied in school mathematics. The
3-dimensional Euclidean space $E_3$ represents the ordinary geometry
of our 3-dimensional physical space. The components $(a_1, a_2, a_3)$ of
a 3-component vector $f$ are, in geometrical illustration, the coordinates
of the 'point' $f$. The norm $\|f\|$ is the ordinary "Euclidean distance"
of the point $f$ from the origin of the coordinate system. The formula


$\|f\|^2 = a_1^2 + a_2^2 + a_3^2$


is simply the Pythagorean theorem when applied twice, first to get
the length of the projection $b$ of the vector $f$ on the plane of the
coordinates $(a_1, a_2)$ (see Fig. 5), and then to get the length of the vector
$f$ from the triangle on the plane of $b$ and $a_3$.


Fig. 5. The norm and the Pythagorean theorem: $\|f\|^2 = b^2 + a_3^2$ and $b^2 = a_1^2 + a_2^2$.


In the geometrical illustration of the Euclidean space $E_3$ each
vector $f = a_1 f_1 + a_2 f_2 + a_3 f_3 \in E_3$ is represented by the point having the
coordinates $(a_1, a_2, a_3)$, or by the arrow pointing to this point and starting
from the origin. Let us study the geometrical illustration of the Euclidean
inner product, by using two such vectors $f = a_1 f_1 + a_2 f_2 + a_3 f_3$ and
$g = b_1 f_1 + b_2 f_2 + b_3 f_3$.




Let the angle between vectors $f$ and $g$ (between the arrows
representing these two vectors) be $\vartheta$. Then the projection of the vector $f$ along
the vector $g$ has the length $\|f\| \cos\vartheta$, as is evident from the triangle on
the plane of $f$ and $g$, shown in Fig. 6. On the other hand, by the
projection theorem we learned at school, we get the same projection
of $f$ on $g$ also by summing up the projections of $a_1 f_1$, $a_2 f_2$, and $a_3 f_3$ on
the vector $g$. If $\alpha_1$ is the angle between $g$ and $f_1$, the projection of $a_1 f_1$
along $g$ has the length $a_1 \cos\alpha_1$. Denoting the angles between $g$ and
$f_2$, and $g$ and $f_3$, respectively, by $\alpha_2$ and $\alpha_3$, we get by the projection theorem
the equation


$\|f\| \cos\vartheta = a_1 \cos\alpha_1 + a_2 \cos\alpha_2 + a_3 \cos\alpha_3$.

Multiplying by $\|g\|$ we get

$\|f\| \cdot \|g\| \cos\vartheta = a_1 \|g\| \cos\alpha_1 + a_2 \|g\| \cos\alpha_2 + a_3 \|g\| \cos\alpha_3$.


But $\|g\| \cos\alpha_1$ is the length of the projection of $g$ along $f_1$, so that
$\|g\| \cos\alpha_1 = b_1$. In a similar way we get $\|g\| \cos\alpha_2 = b_2$ and $\|g\| \cos\alpha_3 = b_3$.
Thus:

(4) $\|f\| \cdot \|g\| \cos\vartheta = a_1 b_1 + a_2 b_2 + a_3 b_3 = \langle f,g \rangle$.


This formula gives the geometrical interpretation of the Euclidean
inner product in a Euclidean 3-dimensional space $E_3$. According to
this interpretation the Euclidean inner product of two vectors $f$ and $g$
is the product of the lengths of the vectors $f$ and $g$ times the cosine of
the angle $\vartheta$ between these vectors.


Fig. 6. Geometrical illustration of the Euclidean inner product by means of the
projection theorem: $\langle f,g \rangle = \|f\| \cdot \|g\| \cos\vartheta = a_1 b_1 + a_2 b_2 + a_3 b_3$.




We can generalize the last formula to an n-dimensional Euclidean
space $E_n$ by writing


$\langle f,g \rangle = \|f\| \cdot \|g\| \cos\vartheta$ for any $f, g \in E_n$,


thus defining the angle $\vartheta$ between two vectors $f$ and $g$ in an n-dimensional
Euclidean space $E_n$. The condition of orthogonality of two vectors
$f$ and $g$, $\cos\vartheta = 0$, then preserves its meaning even in an n-dimensional
space:

$\langle f,g \rangle = 0$ or $\cos\vartheta = 0$

means the orthogonality of $f$ and $g$.
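
The defining formula for the angle can be turned into a small computation valid for any number of components. The function names and the numerical tolerance in the orthogonality test are assumptions of this sketch.

```python
from math import acos, degrees, sqrt


def inner(f, g):
    return sum(a * b for a, b in zip(f, g))


def norm(f):
    return sqrt(inner(f, f))


def angle(f, g):
    """Angle (in radians) defined by <f,g> = |f| |g| cos(angle)."""
    cosine = inner(f, g) / (norm(f) * norm(g))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding just outside [-1, 1]
    return acos(cosine)


def orthogonal(f, g, tol=1e-12):
    """True if the inner product vanishes (up to the tolerance tol)."""
    return abs(inner(f, g)) <= tol


diagonal_angle = degrees(angle((1.0, 0.0), (1.0, 1.0)))
four_dim_check = orthogonal((1.0, 0.0, 0.0, 2.0), (0.0, 3.0, 1.0, 0.0))
```

The second example shows two orthogonal vectors in a 4-dimensional space, where no direct geometrical picture is available but the condition keeps its meaning.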


5 / Mathematical Relation 


Observation of the specific proportions in which the elements of some 
sets are with one another does not always mean, even though it often 
does, the observation of some functional relationships. A more general 
expression for such a specific proportion is mathematical relation. 

Let us first introduce mathematical relation as a generalization
of function. If $f$ is a function from a set $A$ to a set $B$, this means that
we are given a set $\{(x, y)\} \subset A \times B$ of pairs $(x, y)$ where $x$ represents
an argument and $y$ the corresponding value of the function $f$. In each
pair the first member, the argument $x$, determines uniquely the second
member, the value $y$. In fact, a function $f$ does not mean anything but
a particular rule for picking out the accepted pairs $(x, y)$ from all the
elements of $A \times B$. Thus we can write


$f = f\{(x,y)\} \subset A \times B$


indicating so that the function $f$ is a particular rule for picking out the
elements $(x,y)$ of the set $f\{(x,y)\}$.

In the case of a function the set $\{(x, y)\}$ of the elements picked out
from $A \times B$ obeys the further condition that in each pair the first
member $x$ determines uniquely the second member $y$. If we give up this
condition we have a general case of a mathematical relation defined in
the set $A \times B$. Accordingly a mathematical relation defined in $A \times B$ is a
subset $R\{(x,y)\}$ of the elements $(x,y)$ of $A \times B$ which are picked out by
means of a particular rule $R$ from the set $A \times B$:


$R\{(x,y)\} \subset A \times B$.


The word "relation" sometimes refers to the set $R\{(x,y)\}$ itself, and
sometimes to the rule $R$ used for picking out the elements of $R\{(x,y)\}$.




A mathematical relation of the type just discussed is a two-member
relation. We can at once generalize to the case where there are $n$ members
in a relation. We define an n-member mathematical relation in a set
$A_1 \times \ldots \times A_n$ as a subset composed of some elements $(x_1, \ldots, x_n)$ of
$A_1 \times \ldots \times A_n$:

(5) $R\{(x_1, \ldots, x_n)\} \subset A_1 \times \ldots \times A_n$.


This definition can be further generalized by considering, instead
of a sequence of the elements $x_i \in A_i$, some subsets $X_i \subset A_i$. Then we get
an n-member mathematical relation defined in the set $F(A_1) \times \ldots \times F(A_n)$:


$R\{(X_1, \ldots, X_n)\} \subset F(A_1) \times \ldots \times F(A_n)$.


In a general case, just as in the case of a two-member relation, we may
refer by the word "relation" either to the set $R\{\}$ or to the rule $R$ for
picking out the elements of the set $R\{\}$.
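
The distinction between a general relation and a function can be illustrated with two-member relations on a small finite set. The rule is written below as a condition inside a set comprehension and the relation is the resulting set of pairs; the example relations and the helper name `is_functional` are assumptions made for illustration.

```python
from itertools import product

A = {1, 2, 3, 4}
B = {1, 2, 3, 4}

# Each relation is a subset of A x B picked out by a particular rule.
less_than = {(x, y) for x, y in product(A, B) if x < y}
successor = {(x, y) for x, y in product(A, B) if y == x + 1}


def is_functional(relation):
    """True if in each pair the first member determines the second uniquely."""
    first_members = [x for x, _ in relation]
    return len(first_members) == len(set(first_members))
```

The relation `successor` satisfies the uniqueness condition (as a function defined on the arguments 1, 2, 3), while `less_than` does not, since for example 1 stands in this relation to 2, 3 and 4.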


6 / Invariance and Transformation 


Obviously a mathematical relation is a very general notion in terms
of which we can express a great variety of specific proportions that
we observe in reality. Once we have observed such proportions, and
constructed the corresponding mathematical relation, we are often
confronted with the problem of the invariance of the observed relation
with respect to some transformations. The problem of invariance and
transformation can be represented mathematically as follows.

Let us consider a one-to-one function $f$ from a set $A$ onto itself. Such a
function means just a permutation of the elements of $A$ with one another, and
always has an inverse $f^{-1}$. Let us call this kind of function a
transformation in the set $A$.

A transformation $f$ in the set $A$ also induces a permutation of the
subsets of $A$ with one another. If $A_0$ is one of these subsets, let us
write $f(A_0)$ for the subset which is obtained from $A_0$ by the
transformation $f$, i.e. $f(A_0) = \{f(x);\ x \in A_0\}$.

If we have a mathematical relation $R\{(x_1, \ldots, x_n)\}$ defined in the set
$A \times \ldots \times A = A^n$, we call this relation invariant with respect to the
transformation $f$, if


$R\{(x_1, \ldots, x_n)\} = \{(f(x_1), \ldots, f(x_n));\ (x_1, \ldots, x_n) \in R\}$.


In other words, a relation $R$ is invariant with respect to $f$, if the function
$f$ merely permutes the sequences $(x_1, \ldots, x_n)$ included in $R\{(x_1, \ldots, x_n)\}$
with one another but does not change the set $R\{(x_1, \ldots, x_n)\}$ itself.




The invariance of a relation $R\{(X_1, \ldots, X_n)\}$ holding between some
subsets of $A$ can be defined in a corresponding way:


$R\{(X_1, \ldots, X_n)\} = \{(f(X_1), \ldots, f(X_n));\ (X_1, \ldots, X_n) \in R\}$.

If we write $y_i = f(x_i)$ and $Y_i = f(X_i)$ we can write the last two
equations in the form


$R\{(x_1, \ldots, x_n)\} = R\{(y_1, \ldots, y_n);\ y_i = f(x_i)\}$,
$R\{(X_1, \ldots, X_n)\} = R\{(Y_1, \ldots, Y_n);\ Y_i = f(X_i)\}$.


But since $y = f(x)$ defines a one-to-one correspondence between the
$y$-elements and the $x$-elements, and since this same correspondence
is effected by the inverse function $x = f^{-1}(y)$, we can write the last
two equations in another form


$R\{(y_1, \ldots, y_n)\} = R\{(x_1, \ldots, x_n);\ x_i = f^{-1}(y_i)\}$,
$R\{(Y_1, \ldots, Y_n)\} = R\{(X_1, \ldots, X_n);\ X_i = f^{-1}(Y_i)\}$.

These relations can be further written as

$R\{(y_1, \ldots, y_n)\} = \{(f^{-1}(y_1), \ldots, f^{-1}(y_n));\ (y_1, \ldots, y_n) \in R\}$,
$R\{(Y_1, \ldots, Y_n)\} = \{(f^{-1}(Y_1), \ldots, f^{-1}(Y_n));\ (Y_1, \ldots, Y_n) \in R\}$,


which show that the relation $R$ is invariant also with respect to the
inverse transformation $f^{-1}$.

We can proceed a little further by showing that if $f$ and $g$ are two
transformations defined in $A$, and if the relation $R$ is invariant with
respect to both $f$ and $g$, then $R$ is invariant also with respect to the
composite transformation $fg$ (and $gf$). Indeed, if $f$ merely permutes
the elements of $R$ with one another, and if $g$ is another permutation
of these same elements, then $gf$ (and $fg$) is just a permutation composed
of the successive permutations $f$ and $g$, and thus certainly leaves the
set $R$ invariant. This can be easily checked by formal calculation in
a similar manner as we proved above the invariance of $R$ with respect
to $f^{-1}$.
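
These statements can be checked on a small example. The sketch below assumes, for illustration only, the set $A = \{0, 1, 2, 3\}$ thought of as the corners of a square, and the two-member relation "x and y are neighbouring corners"; the transformations and all names are hypothetical.

```python
def transform(relation, f):
    """Apply the transformation f to every member of every sequence in R."""
    return {tuple(f[x] for x in seq) for seq in relation}


def is_invariant(relation, f):
    return transform(relation, f) == relation


def compose(g, f):
    """The composite transformation gf: first f, then g."""
    return {x: g[f[x]] for x in f}


def inverse(f):
    return {y: x for x, y in f.items()}


A = range(4)
R = {(x, y) for x in A for y in A if (x - y) % 4 in (1, 3)}  # neighbouring corners

rotate = {x: (x + 1) % 4 for x in A}   # quarter turn of the square
reflect = {x: (-x) % 4 for x in A}     # reflection exchanging corners 1 and 3
swap = {0: 1, 1: 0, 2: 2, 3: 3}        # exchanges corners 0 and 1 only
```

The relation is invariant with respect to `rotate` and `reflect`, hence also with respect to their inverses and composites, but not with respect to `swap`, which makes the corners 0 and 2 neighbours.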


If we have a set $G = \{f\}$ of transformations defined in $A$, such that

1° the identity transformation 1 belongs to $G$,

2° for each transformation $f \in G$ its inverse $f^{-1}$ also belongs to
$G$, and

3° for any two transformations $f \in G$ and $g \in G$ also their
composite transformation $fg$ (and $gf$) belongs to $G$,

the set $G$ is called a group of transformations.




We have above shown that a mathematical relation defined for the
elements (or subsets) of $A$, and invariant with respect to some
transformations $f$ in $A$, is always invariant with respect to a certain group
$G$ of transformations. In fact the notion of invariance and the notion
of a group of transformations are equivalent. Once a group of
transformations $G$ is given we can identify this group by giving mathematical
relations which are invariant in the transformations of the group. These
relations are the invariants of the group $G$. Vice versa, if some
mathematical relations are given, we can ask for the group of transformations
$G$ in which these relations are invariant. This group sometimes reduces
to a set containing only the identity transformation 1, which is of course
a trivial case from the point of view of invariance. A non-trivial case
is in question when the group $G$, called the invariance group of the
relations in question, does not reduce to the mere identity transformation.
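
For a small finite set the invariance group of a given relation can be found by simply trying every permutation. The following brute-force sketch is meant only as an illustration for very small sets (the number of permutations grows as $n!$); the two example relations and all names are assumptions.

```python
from itertools import permutations


def invariance_group(relation, elements):
    """Return all transformations of `elements` leaving `relation` unchanged."""
    elements = sorted(elements)
    group = []
    for image in permutations(elements):
        f = dict(zip(elements, image))
        if {tuple(f[x] for x in seq) for seq in relation} == relation:
            group.append(f)
    return group


def compose(g, f):
    return {x: g[f[x]] for x in f}


A = range(4)
neighbours = {(x, y) for x in A for y in A if (x - y) % 4 in (1, 3)}
order = {(x, y) for x in A for y in A if x < y}

G_neighbours = invariance_group(neighbours, A)  # a non-trivial case
G_order = invariance_group(order, A)            # the trivial case
```

For the neighbour relation of the four corners of a square the search returns eight transformations, and the composite of any two of them is again among them; for the order relation only the identity transformation remains.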


7 / Algebraic Operation


We shall now study what is meant by an algebraic operation. An
algebraic operation is a particular kind of function, viz. a function $\alpha$ from
a set $A \times A$ to the set $A$ itself:


$\alpha: A \times A \to A$,
$(x,y) \mapsto z$,


where $x$, $y$, and $z$ are elements of the given set $A$.

Accordingly, when we have a given set $A$, we can define an algebraic
operation $\alpha$ by associating with each pair $(x,y)$ of the elements of $A$
one and only one element $z = \alpha(x,y)$ of the same set $A$.

An algebraic operation $\alpha$ is called associative if it can be continued
to a function $\alpha^*$ from $A \times A \times A$ to $A$ in a uniquely determined way,
i.e. if

$\alpha^*(x,y,z) = \alpha(\alpha(x,y),z) = \alpha(x,\alpha(y,z))$.

We observe that an associative algebraic operation associates with
any sequence $(x, y, z, u, \ldots)$ of elements of $A$ one and only one element
of $A$. This element depends only on the elements $x, y, z, u, \ldots$ and on
their mutual order in the sequence.

An algebraic operation $\alpha$ is called commutative if its value is
independent of the order of the arguments, i.e. if


$\alpha(x,y) = \alpha(y,x)$


for any given elements $x$ and $y$ of $A$.




If we have two algebraic operations $\alpha$ and $\beta$ defined for the same
set $A$, and if at least $\alpha$ is commutative, we say that $\beta$ is distributive with
respect to $\alpha$, if


$\beta(x, \alpha(y,z)) = \alpha(\beta(x,y), \beta(x,z))$


for any elements $x$, $y$, and $z$ of $A$.
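
On a finite set these three properties can be verified by going through all combinations of elements. The sketch below uses, as an assumed example not taken from the text, the set $\{0, 1, \ldots, 5\}$ with addition, multiplication and subtraction modulo 6; all names are hypothetical.

```python
from itertools import product


def is_associative(op, elems):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(elems, repeat=3))


def is_commutative(op, elems):
    return all(op(x, y) == op(y, x) for x, y in product(elems, repeat=2))


def is_distributive(beta, alpha, elems):
    """Check beta(x, alpha(y, z)) == alpha(beta(x, y), beta(x, z))."""
    return all(beta(x, alpha(y, z)) == alpha(beta(x, y), beta(x, z))
               for x, y, z in product(elems, repeat=3))


def add6(x, y):
    return (x + y) % 6


def mul6(x, y):
    return (x * y) % 6


def sub6(x, y):
    return (x - y) % 6


Z6 = range(6)
```

Subtraction modulo 6 is also an algebraic operation on this set, but it is neither associative nor commutative.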

Let an associative and commutative algebraic operation $\alpha$ be defined
in a set $A$. If there is an element $x_\alpha^0$ of $A$, such that


$\alpha(x_\alpha^0, x) = x$

for any element $x$ of $A$, the element $x_\alpha^0$ is called the zero element of the
operation $\alpha$. If, for each element $x$ of $A$, there is an element $x_\alpha^{-1}$ such
that

$\alpha(x, x_\alpha^{-1}) = x_\alpha^0$,


the element $x_\alpha^{-1}$ is the inverse of $x$ with respect to the operation $\alpha$.

A set $A$, in which an associative and commutative algebraic operation
$\alpha$ is defined so that the zero element $x_\alpha^0$ and the inverses $x_\alpha^{-1}$ are
uniquely determined, is called an Abelian group. An example of such
a group is the set of all integers where addition satisfies these
requirements. Indeed, the addition associates with any pair $(n,m)$ of integers
another integer, viz. the sum $n + m$. The addition is associative, since
$(n+m)+k = n+(m+k) = n+m+k$, and commutative since $n+m = m+n$.
The zero element of addition is 0, and the inverse of an integer
$n$ with respect to the addition is $-n$.

Another example of an Abelian group is the set of all rational numbers
different from zero with respect to multiplication (zero must be left out,
since it has no inverse). Indeed, multiplication associates with
any pair $(r_1, r_2)$ of such rational numbers another rational number, viz. the
product $r_1 r_2$. The multiplication of rational numbers is associative,
since $(r_1 r_2) r_3 = r_1 (r_2 r_3) = r_1 r_2 r_3$, and it is also commutative since $r_1 r_2 =
r_2 r_1$. The zero element of multiplication is 1, and the inverse of a
rational number $r$ with respect to multiplication is the rational number
$\frac{1}{r}$.
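
The search for the zero element and the inverses can be sketched on finite examples. Since the integers and the rational numbers are infinite, the sketch below uses arithmetic modulo 5 as an assumed stand-in; the helper names are hypothetical.

```python
def zero_element(op, elems):
    """Return the unique e with op(e, x) == x for all x, or None."""
    candidates = [e for e in elems if all(op(e, x) == x for x in elems)]
    return candidates[0] if len(candidates) == 1 else None


def inverses(op, elems, zero):
    """Map each x to its unique inverse; None if some x has no unique inverse."""
    table = {}
    for x in elems:
        found = [y for y in elems if op(x, y) == zero]
        if len(found) != 1:
            return None
        table[x] = found[0]
    return table


def add5(x, y):
    return (x + y) % 5


def mul5(x, y):
    return (x * y) % 5


Z5 = [0, 1, 2, 3, 4]
nonzero = [1, 2, 3, 4]

additive_inverses = inverses(add5, Z5, zero_element(add5, Z5))
multiplicative_inverses = inverses(mul5, nonzero, zero_element(mul5, nonzero))
with_zero_included = inverses(mul5, Z5, zero_element(mul5, Z5))
```

With respect to multiplication the inverses exist only when 0 is left out of the set: `with_zero_included` is `None`, because no element multiplied by 0 gives 1.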

A set $A$ in which an associative and commutative algebraic operation
$\alpha$ and an associative algebraic operation $\beta$, which is distributive with
respect to $\alpha$, are defined, is called a ring provided that the zero element
$x_\alpha^0$ and the inverses $x_\alpha^{-1}$ are uniquely determined. The operation $\alpha$
is called ring addition, and the operation $\beta$ ring multiplication. A ring
may or may not have a zero element $x_\beta^0$ or the inverses $x_\beta^{-1}$ with
respect to the $\beta$-operation. The $\beta$-operation may or may not be commutative.
In the former case the ring is called commutative. An example of a
commutative ring is the set of all integers, when $\alpha$ is defined by addition
and $\beta$ by multiplication. There is a zero element $x_\alpha^0$, viz. 0, and a
zero element $x_\beta^0$, viz. 1, in this ring but there are in general no inverses
$x_\beta^{-1}$, since the inverse $\frac{1}{n}$ of an integer $n$ with respect to multiplication
is not an integer itself (except when $n = \pm 1$).

A ring where even the element $x_\beta^0$ and the inverses $x_\beta^{-1}$ (for every
element other than $x_\alpha^0$) exist and
are uniquely determined is called a field of numbers. In a field of numbers
the element $x_\alpha^0$ is called zero, and the element $x_\beta^0$ is called the unit.
In a field of numbers we can speak of four fundamental operations,
which are called and denoted as follows:


addition: $\alpha(x,y) = x + y$,
subtraction: $\alpha(x, y_\alpha^{-1}) = x - y$,
multiplication: $\beta(x,y) = xy$, and
division: $\beta(x, y_\beta^{-1}) = \frac{x}{y}$.


An example of a field of numbers is the field of rational numbers, or
the field of real numbers.

In the preceding sections we have already met an algebraic notion 
which we shall now define in more general terms. 
A set $A$ is called a linear space over the field $K$ of numbers, if 1° $A$ is
an Abelian group with respect to an operation $\alpha$, and if 2° a commutative
algebraic operation $\gamma$ from $K \times A$ to $A$ is defined, $\gamma$ being
distributive with $\alpha$:

$$\gamma: K \times A \to A,$$

$$(k,x) \xrightarrow{\;\gamma\;} \gamma(k,x),$$

$$\gamma(k,x) = \gamma(x,k),$$

$$\gamma(k,\alpha(x,y)) = \alpha(\gamma(k,x), \gamma(k,y)).$$

The operation $\gamma$ is called scalar multiplication, and denoted by
$\gamma(k,x) = kx$. Accordingly, we have in a linear space two algebraic operations,
viz. addition and scalar multiplication. An example of a linear space
over the field of real numbers is the function space $F(A,R)$ studied in
earlier sections.

A set $A$ is called a general algebra if it is simultaneously 1° a linear
space over a field of numbers, and 2° a ring, and if 3° the ring multiplication
is commutative with the scalar multiplication. A trivial example
of a general algebra is the set of all real numbers. We shall soon meet
a less trivial example, viz. matrix algebra.



If a number of algebraic operations, $\alpha_1, \ldots, \alpha_n$, is defined in both
the sets $A$ and $B$, the set $B$ is said to be homomorphic to the set $A$ if
there is a function $f$ from $A$ to $B$ so that the algebraic operations are
invariant: $f(\alpha_i(x,y)) = \alpha_i(f(x),f(y))$ for every operation $\alpha_i$. If such a
function $f$ is one-to-one, the sets $A$ and $B$ are said to be isomorphic
to one another. Thus isomorphism and homomorphism are algebraic
notions (while the 'homeomorphism' to be studied in 5 § is a topological
notion).
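
A small numerical illustration of this definition may help. The example is an assumption made here, not one taken from the text: $A$ is the set of integers, $B$ the set of residues 0, 1, .., 4, and $f$ maps an integer to its remainder modulo 5. Both addition and multiplication are invariant under $f$, but $f$ is not one-to-one, so this is a homomorphism without being an isomorphism.

```python
MODULUS = 5  # illustrative choice, not from the text

def f(x: int) -> int:
    """Map an integer to its residue modulo MODULUS."""
    return x % MODULUS

def add_in_B(u: int, v: int) -> int:
    """Addition in the set B of residues."""
    return (u + v) % MODULUS

def mul_in_B(u: int, v: int) -> int:
    """Multiplication in the set B of residues."""
    return (u * v) % MODULUS

def operations_invariant(samples) -> bool:
    """Check f(x + y) = f(x) + f(y) and f(xy) = f(x) f(y) on all sample pairs."""
    return all(
        f(x + y) == add_in_B(f(x), f(y)) and f(x * y) == mul_in_B(f(x), f(y))
        for x in samples
        for y in samples
    )

samples = range(-10, 11)
homomorphic = operations_invariant(samples)
one_to_one = len({f(x) for x in samples}) == len(samples)
```

The check covers only finitely many sample pairs, so it illustrates the invariance condition rather than proving it.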


3 § Matrices


1 / Matrix Algebra 


A matrix is a table of numbers containing, say, $n$ rows and $m$ columns.
If the number belonging to the $j$th row and the $k$th column is denoted
by $a_{jk}$ we can write such a matrix as follows:

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix} = \text{an } n \times m \text{ matrix.}$$

As indicated above a matrix containing $n$ rows and $m$ columns is referred
to as an "$n \times m$ matrix". The numbers $a_{jk}$, or the elements of
the matrix, may be taken from any fixed field of numbers.

We shall restrict ourselves first to square matrices, i.e. to matrices
where there are as many rows as there are columns: $n = m$. We shall also
restrict ourselves to real matrices, i.e. to matrices whose elements are
real numbers.

Let $M(n \times n)$ be the set of all real $n \times n$ matrices. We shall show that
this set forms a general algebra with respect to certain algebraic operations
$\alpha$, $\beta$, and $\gamma$. To show this we must show 1) that $M(n \times n)$ is a linear
space over the field $R$ of real numbers with respect to an addition $\alpha$
and a scalar multiplication $\gamma$, 2) that $M(n \times n)$ can be extended to a
ring by defining a ring multiplication $\beta$, and 3) that $\beta$ and $\gamma$ are commutative
with one another.

For the first task we must show a) that $M(n \times n)$ is an Abelian group
with respect to an addition operation $\alpha$, and b) that a scalar multiplication
$\gamma$ can be defined as a function from $R \times M(n \times n)$ to $M(n \times n)$.




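For this purpose we define an addition $\alpha(A,B) = A+B$ of any two
matrices $A \in M(n \times n)$ and $B \in M(n \times n)$ by

$$\underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}}_{A} + \underbrace{\begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{pmatrix}}_{B} = \underbrace{\begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \cdots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22} & \cdots & a_{2n}+b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1}+b_{n1} & a_{n2}+b_{n2} & \cdots & a_{nn}+b_{nn} \end{pmatrix}}_{A+B}$$

In other words, the sum $A+B$ of the matrices $A$ and $B$ is obtained by
adding together the corresponding elements of the matrices $A$ and $B$.
If the element in the $j$th row and the $k$th column in the matrix $A+B$
is denoted by $c_{jk}$, we thus have the following rule:

$$c_{jk} = a_{jk} + b_{jk}.$$

Since

$$(a_{jk} + b_{jk}) + c_{jk} = a_{jk} + (b_{jk} + c_{jk}) = a_{jk} + b_{jk} + c_{jk},$$

and

$$a_{jk} + b_{jk} = b_{jk} + a_{jk},$$

the addition of matrices so defined is associative and commutative.
Thus we can write $(A+B)+C = A+(B+C) = A+B+C$ and $A+B = B+A$.
The zero matrix 0 can be defined as a matrix all of whose elements
are zero. The inverse of a matrix $A$ with respect to addition can be
defined as a matrix $-A$ composed of the elements $-a_{jk}$, the $a_{jk}$ being
the elements of the matrix $A$. The set $M(n \times n)$ of square matrices is
then an Abelian group with respect to the addition.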


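A scalar multiplication of any matrix $A$ of $M(n \times n)$ by a real number
$r$ can be defined by

$$r \underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} ra_{11} & ra_{12} & \cdots & ra_{1n} \\ ra_{21} & ra_{22} & \cdots & ra_{2n} \\ \vdots & \vdots & & \vdots \\ ra_{n1} & ra_{n2} & \cdots & ra_{nn} \end{pmatrix}}_{rA}$$

Since $ra_{jk} = a_{jk}r$ this operation is commutative, and since
$r(a_{jk}+b_{jk}) = ra_{jk}+rb_{jk}$ it is also distributive with matrix addition. Thus we can
write $rA = Ar$ and $r(A+B) = rA+rB$. Accordingly, the set $M(n \times n)$
is a linear space over the field of real numbers.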

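The second task is to construct a ring multiplication $\beta$ such that
$M(n \times n)$ becomes a ring with respect to the addition $\alpha$ and the ring
multiplication $\beta$. For this purpose we define a matrix multiplication
$\beta(A,B) = AB$ of any two matrices $A \in M(n \times n)$ and $B \in M(n \times n)$ by the
following rule:

$$c_{jk} = \sum_{i=1}^{n} a_{ji} b_{ik},$$

where $c_{jk}$ is the element of the $j$th row and the $k$th column in the matrix
$C = AB$, and $a_{ji}$ and $b_{ik}$ are the corresponding elements in the matrices
$A$ and $B$, respectively. In words: we get the element $c_{jk}$ of the matrix
$AB$ by computing the "product sum" between the $j$th row of $A$ and
the $k$th column of $B$, as indicated below:

$$\underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{pmatrix}}_{B} = \underbrace{\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}}_{AB}$$

For instance, if we have two $3 \times 3$ matrices

$$A = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 1 \\ 0 & 2 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 2 & 1 & b_{13} \\ 3 & 1 & 3 \\ 1 & 2 & 3 \end{pmatrix},$$

then the element $c_{11}$ of $C = AB$ is obtained from the elements of the first
row of $A$ and the first column of $B$ in the following way:

$$c_{11} = 1 \cdot 2 + 3 \cdot 3 + 2 \cdot 1 = 13.$$

The element $c_{12}$ is obtained from the first row of $A$ and the second column
of $B$ in a similar way:

$$c_{12} = 1 \cdot 1 + 3 \cdot 1 + 2 \cdot 2 = 8.$$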
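
The two product sums of the worked example can be recomputed in a few lines. This is a minimal sketch; the function name `product_sum` is introduced here for illustration, and only the first row of $A$ and the first two columns of $B$ from the example are used.

```python
def product_sum(row, column):
    """Sum of the element-wise products of a row of A and a column of B."""
    if len(row) != len(column):
        raise ValueError("row and column must have the same length")
    return sum(a * b for a, b in zip(row, column))

first_row_of_A = [1, 3, 2]
first_column_of_B = [2, 3, 1]
second_column_of_B = [1, 1, 2]

c_11 = product_sum(first_row_of_A, first_column_of_B)   # 1*2 + 3*3 + 2*1
c_12 = product_sum(first_row_of_A, second_column_of_B)  # 1*1 + 3*1 + 2*2
```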


Since

$$\sum_k \Big( \sum_i a_{ji} b_{ik} \Big) c_{kl} = \sum_i a_{ji} \Big( \sum_k b_{ik} c_{kl} \Big) = \sum_i \sum_k a_{ji} b_{ik} c_{kl},$$

the matrix multiplication is associative. Thus we can write $(AB)C =
A(BC) = ABC$ for any three matrices $A$, $B$, and $C$ from $M(n \times n)$.
It follows from this that we have a definite matrix $A_1 A_2 \cdots A_k$ as the
product of any matrices $A_1, \ldots, A_k$ from $M(n \times n)$.

However, since the "product sum" of the $j$th row of $A$ and the $k$th
column of $B$ is not necessarily the same as the "product sum" of the
$j$th row of $B$ and the $k$th column of $A$, the matrix multiplication is
not commutative. Thus in general $AB$ and $BA$ are different matrices.

But commutativity is not a necessary condition for ring multiplication.
Since

$$\sum_i a_{ji} (b_{ik} + c_{ik}) = \sum_i a_{ji} b_{ik} + \sum_i a_{ji} c_{ik},$$

the matrix multiplication is distributive with respect to matrix addition.
Thus we can write $A(B+C) = AB+AC$.

It follows from the last results that $M(n \times n)$ indeed is a ring with
respect to the matrix multiplication and the matrix addition defined
above. It remains to be shown that matrix multiplication is commutative
with scalar multiplication of matrices. We have to show that $rAB =
ArB = ABr$ for any two matrices $A$ and $B$ from $M(n \times n)$ and for
any real number $r$. But this is evident, since

$$r \sum_i a_{ji} b_{ik} = \sum_i a_{ji} r b_{ik} = \sum_i a_{ji} b_{ik} r.$$

Thus we have completed the proof that $M(n \times n)$ is a general algebra
with respect to the operations of matrix addition, matrix multiplication,
and scalar multiplication of matrices defined above.
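
The properties established above can be spot-checked on small integer matrices. The sketch below is not part of the proof: the three $2 \times 2$ matrices and the number $r$ are arbitrary choices made here, and the helper names are illustrative.

```python
def mat_add(A, B):
    """Element-wise sum: c_jk = a_jk + b_jk."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def mat_mul(A, B):
    """Matrix product: c_jk = sum over i of a_ji * b_ik."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def scalar_mul(r, A):
    """Scalar multiple rA."""
    return [[r * a for a in row] for row in A]

# Arbitrary example matrices (assumptions for illustration only).
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
r = 3

associative = mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
distributive = mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C))
scalar_commutes = scalar_mul(r, mat_mul(A, B)) == mat_mul(A, scalar_mul(r, B))
commutative = mat_mul(A, B) == mat_mul(B, A)
```

For these matrices the first three checks succeed while the last one fails, in line with the statement that $AB$ and $BA$ are in general different.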

If we relax the conditions of general algebra we can define algebraic
operations for non-square matrices. For instance, we observe that the
set $M(n \times m)$ of all $n \times m$ real matrices is a linear space over the field
of real numbers. Indeed, we can at once generalize the addition defined
above for square matrices to this non-square case. Thus we can add
up any two $n \times m$ matrices $A(n \times m)$ and $B(n \times m)$. We can also extend
the scalar multiplication to these matrices so that any $n \times m$ matrix
$A(n \times m)$ can be multiplied by a real number $r$ to get the matrix $rA = Ar$.

We observe also that a product $A(n \times m)B(m \times k)$ can always be
computed for an $n \times m$ matrix $A(n \times m)$ and an $m \times k$ matrix $B(m \times k)$,
when $n$, $m$, and $k$ are arbitrary positive integers.

A matrix is often denoted by indicating its "general element", i.e.
the element of, say, the $j$th row and the $k$th column. The matrix is
then written as a collection of these elements: $A = \|a_{jk}\|$.


2 / Linear Transformation 


The usefulness of the matrix algebra $M(n \times n)$ is due to the equivalence
of $n \times n$ matrices with linear transformations in an $n$-dimensional vector
space $V_n$.

We have (see p. 29) introduced the vector space $V_n$ as a function
space $F(A,R)$, where the basic set $A$ has $n$ elements $x_1, x_2, \ldots, x_n$, $n$ being
a finite number. Each function $f$ from $A$ to $R$ is a combination of the
$n$ values $f(x_1), f(x_2), \ldots, f(x_n)$. Denoting these values by $f(x_1) = a_1$,
$f(x_2) = a_2$, .., $f(x_n) = a_n$ we could express each function $f$ as a combination
$f = (a_1, \ldots, a_n)$ of the $n$ real numbers $a_1, a_2, \ldots, a_n$. We shall
now write the components of this $n$-component vector $f$ as a column,
and denote this column by $x$. So we have the one-to-one correspondence
between all the vectors $f \in V_n$ and the $n \times 1$ matrices $x \in M(n \times 1)$:

$$f = (a_1, a_2, \ldots, a_n) \longleftrightarrow x = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} \quad \text{(one-to-one).}$$


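We can even define the addition of vectors, and the scalar multiplication
of a vector, by the respective matrix operations:

$$x + y = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{pmatrix} \quad \text{and} \quad kx = \begin{pmatrix} ka_1 \\ ka_2 \\ \vdots \\ ka_n \end{pmatrix},$$

where $x$ and $y$ are the columns composed of the numbers $a_1, a_2, \ldots, a_n$
and $b_1, b_2, \ldots, b_n$, respectively. Then we can represent the base vectors
$f_1, f_2, \ldots, f_n$ by the respective columns $e_1, e_2, \ldots, e_n$ as follows:

$$f_1 \longleftrightarrow e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad f_2 \longleftrightarrow e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad f_n \longleftrightarrow e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$

Now the representation of the vector $f$ as a linear combination
$f = a_1 f_1 + a_2 f_2 + \ldots + a_n f_n$ becomes expressed by

$$x = a_1 e_1 + a_2 e_2 + \ldots + a_n e_n.$$

Accordingly, all the vector operations in a vector space $V_n$ can be
translated to matrix operations with the $n \times 1$ matrices of the set $M(n \times 1)$
in a unique way. The latter set of matrices is thus a one-to-one matrix
representation of the vector space $V_n$. We shall call it the representation
of $V_n$ by "column vectors", and denote it by $W_n$:

$$M(n \times 1) = W_n \longleftrightarrow V_n \quad \text{(one-to-one).}$$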


Another matrix representation of the vector space $V_n$ is given by
the matrix set $M(1 \times n)$. From every column vector $x$ we get an element
of $M(1 \times n)$, or a "row vector", just by writing the components of $x$
along a row:

$$x \longleftrightarrow x' = (a_1 \; a_2 \; \ldots \; a_n).$$

Even this correspondence, of course, is one-to-one. Matrix addition
and the multiplication of a matrix by a scalar again give the corresponding
vector operations. The base vectors are now given by

$$e_1' = (1 \; 0 \; 0 \; \ldots \; 0),$$
$$e_2' = (0 \; 1 \; 0 \; \ldots \; 0),$$
$$\ldots$$
$$e_n' = (0 \; 0 \; 0 \; \ldots \; 1).$$

The representation of the vector space $V_n$ by the row vectors
$x' \in M(1 \times n)$ is called dual to the representation $W_n$ by the column vectors.
The vector space spanned by the row vectors is called dual to the space
$W_n$, and denoted by $W_n'$:

$$M(1 \times n) = W_n' \longleftrightarrow V_n \quad \text{(one-to-one).}$$


Let us now consider the matrix algebra $M(n \times n)$. Let $A$ be one of
its elements, that is, a matrix composed of $n$ rows and $n$ columns of
real numbers. (Of course, this $A$ should not be confused with the basic
set of the function space $F(A,R)$ which was denoted by the same letter.)
The matrix product of such an $n \times n$ matrix $A$ and an $n$-component
column vector $x$ is again an $n$-component column vector. Let us denote
it by $y$:

$$y = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = \underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}}_{x}.$$

When written explicitly for the components of $y$ this gives, according
to the rules of matrix multiplication:

$$b_1 = a_{11} a_1 + a_{12} a_2 + \ldots + a_{1n} a_n,$$
$$b_2 = a_{21} a_1 + a_{22} a_2 + \ldots + a_{2n} a_n,$$
$$\ldots$$
$$b_n = a_{n1} a_1 + a_{n2} a_2 + \ldots + a_{nn} a_n.$$

When every vector $x \in W_n$ is multiplied by the same $n \times n$ matrix $A$ in
this way, we get all these vectors transformed to the corresponding
$y$-vectors within the space $W_n$. Thus such an $n \times n$ matrix $A$ determines
a transformation of the vector space $W_n$ to itself. We call it a linear
transformation of this space.

The above equation written explicitly for the $b_1, b_2, \ldots, b_n$ can be
expressed in matrix form either as $y = Ax$ or as $y' = x'A'$. Here $A'$
is the 'transpose' of $A$, obtained from the matrix $A$ by writing the
rows as columns and vice versa. Thus to every linear transformation
$x \to y = Ax$ in the space $W_n$ there corresponds the linear transformation
$x' \to y' = x'A'$ in the dual space $W_n'$:

$$x \to y = Ax \quad \longleftrightarrow \quad x' \to y' = x'A'.$$

From the equation $y' = (Ax)' = x'A'$ we see that $(Ax)' = x'A'$.
This rule can be generalized to any matrix product $AB$ so that we have
in general:

$$(AB)' = B'A'.$$
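
The transposition rule for a product is easy to check numerically. The following sketch uses two arbitrary $3 \times 3$ integer matrices chosen here for illustration; the helper names are assumptions, not notation from the text.

```python
def transpose(M):
    """Write the rows of M as columns and vice versa."""
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    """Matrix product: c_jk = sum over i of a_ji * b_ik."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Arbitrary example matrices (assumptions for illustration only).
A = [[1, 3, 2], [2, 1, 1], [0, 2, 1]]
B = [[1, 2, 7], [2, 1, 3], [3, 1, 1]]

lhs = transpose(mat_mul(A, B))
rule_holds = lhs == mat_mul(transpose(B), transpose(A))   # (AB)' = B'A'
wrong_order = lhs == mat_mul(transpose(A), transpose(B))  # A'B' is (BA)', not (AB)'
```

The comparison with the factors in the original order fails for these matrices, which shows why the order must be reversed.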


Let us now study under which conditions a linear transformation
$x \to y = Ax$, or, what is the same, $x' \to y' = x'A'$, is one-to-one. The
domain $D_A$ of the transformation $A$ is, of course, the vector space $W_n$:

$$D_A = W_n = \{x;\ x = a_1 e_1 + \ldots + a_n e_n;\ a_i \in R\}.$$

The range of the transformation $A$, on the other hand, is

$$R_A = \{y;\ y = Ax;\ x \in W_n\} \subset W_n.$$

We have to study under which conditions $R_A = D_A$.
For this purpose we can express $y$ in terms of the base vectors
$e_1, e_2, \ldots, e_n$:

$$y = Ax = A(a_1 e_1 + \ldots + a_n e_n) = a_1 (Ae_1) + \ldots + a_n (Ae_n).$$

Evidently, if the transformed base vectors $Ae_1, Ae_2, \ldots, Ae_n$ are linearly
independent of one another, the vectors $y$, obtained when the weights
$a_1, a_2, \ldots, a_n$ run over all real numbers, cover the whole space $W_n$. In
this case we thus have $R_A = W_n = D_A$, so that the transformation
$A$ is one-to-one.

What are the transformed base vectors $Ae_1, Ae_2, \ldots, Ae_n$? Performing
the matrix multiplications we find that they are the column vectors
formed by the columns of the matrix $A$:

$$h_1 = Ae_1 = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix}, \quad h_2 = Ae_2 = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{pmatrix}, \quad \ldots, \quad h_n = Ae_n = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{nn} \end{pmatrix}.$$

The linear independence of these vectors of one another or, as we can
say, the linear independence of the columns of the matrix $A$ of one
another, is thus the necessary and sufficient condition for the transformation
determined by the matrix $A$ being one-to-one.

If we consider, instead of the transformation $x \to y = Ax$, the dual
transformation $x' \to y' = x'A'$ we find out that another condition,
equivalent to the previous one, is that the rows of $A$ must be linearly
independent of one another. These two conditions are the same, and
are expressed together by saying that the rank of the matrix $A$ is $n$.
Such a matrix is called non-singular.


3 / The Inverse of Matrix 


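Every non-singular $n \times n$ matrix $A$ has of course an inverse, that is,
there is a uniquely determined $n \times n$ matrix $A^{-1}$ so that
$AA^{-1} = A^{-1}A = I$. Here $I$ is the unit matrix

$$I = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}.$$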


To show how the inverse matrix is calculated we first define, for
any $m \times m$ matrix $B$, a real number called its determinant and denoted
by $|B|$. It is called a determinant of the $m$th order. The definition and
the calculation of a determinant of the $m$th order proceed by means of
a recursive formula which reduces each determinant of the $m$th order
to a weighted sum of determinants of the $(m-1)$th order. This kind of
recursion determines uniquely the value of any determinant, for we
make the further convention that the determinant of a $1 \times 1$ matrix
containing only an element $a \in R$ is the number $a$ itself: $|a| = a$.

The recursion formula is

$$|B| = \sum_{k=1}^{m} (-1)^{j+k}\, b_{jk}\, |B_{jk}|, \quad \text{with fixed } j, \tag{6}$$

where $b_{jk}$ is the element of the $j$th row and $k$th column of the matrix
$B$, and $|B_{jk}|$ is the determinant of the $(m-1) \times (m-1)$ matrix obtained
from the matrix $B$ when the $j$th row and $k$th column are eliminated.
Since the order of the determinants $|B_{jk}|$ is $m-1$, this formula gives
the desired recursion.

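Let us study how the recursion works in practice. Let us calculate
the determinant of a $3 \times 3$ matrix

$$B = \begin{pmatrix} 1 & 2 & 7 \\ 2 & 1 & 3 \\ 3 & 1 & 1 \end{pmatrix}.$$

We can choose any row whatever to be the $j$th row which is the basis
of the reduction. Let us choose $j = 1$. Then the first application of the
recursion formula gives:

$$|B| = (-1)^{1+1} \cdot 1 \cdot \begin{vmatrix} 1 & 3 \\ 1 & 1 \end{vmatrix} + (-1)^{1+2} \cdot 2 \cdot \begin{vmatrix} 2 & 3 \\ 3 & 1 \end{vmatrix} + (-1)^{1+3} \cdot 7 \cdot \begin{vmatrix} 2 & 1 \\ 3 & 1 \end{vmatrix}.$$

Applying the same recursion then for the remaining determinants of
the second order we get:

$$\begin{vmatrix} 1 & 3 \\ 1 & 1 \end{vmatrix} = (-1)^{1+1} \cdot 1 \cdot 1 + (-1)^{1+2} \cdot 3 \cdot 1 = 1 - 3 = -2,$$

$$\begin{vmatrix} 2 & 3 \\ 3 & 1 \end{vmatrix} = (-1)^{1+1} \cdot 2 \cdot 1 + (-1)^{1+2} \cdot 3 \cdot 3 = 2 - 9 = -7,$$

$$\begin{vmatrix} 2 & 1 \\ 3 & 1 \end{vmatrix} = (-1)^{1+1} \cdot 2 \cdot 1 + (-1)^{1+2} \cdot 1 \cdot 3 = 2 - 3 = -1.$$

Thus for the determinant $|B|$ we get the result

$$|B| = (-1)^{1+1} \cdot 1 \cdot (-2) + (-1)^{1+2} \cdot 2 \cdot (-7) + (-1)^{1+3} \cdot 7 \cdot (-1) = -2 + 14 - 7 = 5.$$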

Useful formulae holding for determinants are

$$|A'| = |A| \quad \text{and} \quad |AB| = |A| \cdot |B|.$$

Now the inverse of a non-singular $n \times n$ matrix $A$ is simply given by
$A^{-1} = \|(A^{-1})_{jk}\|$, where

$$(A^{-1})_{jk} = \frac{(-1)^{j+k}\, |A_{kj}|}{|A|}. \tag{7}$$

Notice the reversed order of subscripts in the sub-determinant:
$A_{kj}$ versus $(A^{-1})_{jk}$.

We see from the formula that in order that $A^{-1}$ exists the determinant
$|A|$ must be different from zero. This is indeed a necessary
and sufficient condition for the non-singularity of $A$. Accordingly,
we have three mutually equivalent conditions of non-singularity of $A$:
the linear independence of its rows, the linear independence of its
columns, and the non-vanishing of its determinant.
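
Formula (7) can be tried out with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the helper names are chosen here, indices are 0-based, the determinant of an empty matrix is taken as 1 so that the $1 \times 1$ case works, and the example matrix is an arbitrary non-singular one.

```python
from fractions import Fraction

def minor(A, j, k):
    """Matrix left when row j and column k (0-based) of A are eliminated."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(A) if i != j]

def determinant(A):
    """Cofactor expansion along the first row; the empty determinant is taken as 1."""
    if not A:
        return 1
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k] * determinant(minor(A, 0, k)) for k in range(len(A)))

def inverse(A):
    """Element (j, k) of the inverse is (-1)^(j+k) |A_kj| / |A| (note the reversed subscripts)."""
    det = determinant(A)
    if det == 0:
        raise ValueError("singular matrix: the determinant vanishes")
    n = len(A)
    return [
        [Fraction((-1) ** (j + k) * determinant(minor(A, k, j)), det) for k in range(n)]
        for j in range(n)
    ]

def mat_mul(A, B):
    """Matrix product: c_jk = sum over i of a_ji * b_ik."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 7], [2, 1, 3], [3, 1, 1]]  # arbitrary non-singular example
A_inv = inverse(A)
identity = [[int(j == k) for k in range(3)] for j in range(3)]
is_inverse = mat_mul(A, A_inv) == identity and mat_mul(A_inv, A) == identity
```

The cofactor formula is convenient for small matrices; for large ones its cost grows very quickly, so it is shown here for clarity rather than efficiency.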


4 / Eigen Values 


We can now go into a deeper analysis of the metric of a vector space.
Let us represent the general vector space $V_n$ by a column vector space
$W_n$. Then, obviously, every bilinear functional $\varphi$ on $V_n$ is represented
in $W_n$ by a matrix function as follows:

$$\varphi(x,y) = x'Cy. \tag{8}$$

Here $C$ is an $n \times n$ real matrix, and $x$ and $y$ are two column vectors
from $W_n$, $x'$ being the row vector formed as the transpose of $x$. In
particular, the symmetry condition of $\varphi$ becomes

$$C' = C.$$

Such a matrix is called symmetric. Thus any symmetric $n \times n$ matrix
can define a metric in an $n$-dimensional vector space. When a metric
is defined by such a matrix $C$ we can call this matrix the metric matrix
of the vector space in question.




Let us study the inner product defined by

$$\langle x,y \rangle = x'Cy$$

in the vector space $W_n$. In a linear transformation determined by a
matrix $A$ the vector $x$ is transformed to $Ax$ and the vector $y$ to the vector
$Ay$. Thus the inner product between $x$ and $y$ is transformed as follows:

$$\langle x,y \rangle = x'Cy \xrightarrow{\;A\;} (Ax)'C(Ay) = x'A'CAy.$$

Thus the metric matrix $C$ has been replaced by the matrix $A'CA$:

$$C \xrightarrow{\;A\;} A'CA.$$

We apply now, without proof, the theorem according to which "every
symmetric real matrix $C$ can be transformed by an orthogonal transformation
$A$ to a diagonal matrix $D$". The transformation determined
by the matrix $A$ is orthogonal when

$$AA' = A'A = I \quad \text{so that} \quad A' = A^{-1}.$$

The diagonality of $D$ means that $D$ is of the form

$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}.$$

Thus, by the above theorem, we can write

$$A'CA = D, \tag{9}$$

where $A$ is an orthogonal transformation.

According to the above theorem we have the result that every metric
in a vector space $W_n$ can be expressed, after a suitable orthogonal
transformation, by a diagonal metric matrix $D$:

$$\langle x,y \rangle = x'Cy \xrightarrow{\;A\;} x'Dy.$$

To study the different metrics of a vector space we thus have to study
the different diagonal metric matrices $D$. The diagonal elements of $D$
are called the eigen values of all the symmetric matrices $C$ that can be
transformed to $D$ by an orthogonal transformation. Thus, to study
the different metrics we have to study only the eigen values of the
metric matrices.




For a vector $x$ having the components $a_1, a_2, \ldots, a_n$ we get, using
the diagonal representation of the metric matrix:

$$\langle x,x \rangle = d_1 a_1^2 + d_2 a_2^2 + \ldots + d_n a_n^2. \tag{10}$$

Since the squares $a_1^2, \ldots, a_n^2$ are positive unless they are zero, the whole
expression $\langle x,x \rangle$ is positive for all $x \neq 0$ if, and only if, all the eigen
values $d_1, d_2, \ldots, d_n$ are positive. Accordingly, the metric defined by
a matrix $C$ is positive-definite if, and only if, the eigen values of $C$ are
all positive. In a similar way we observe that the metric is negative-definite,
if all the eigen values of the metric matrix are negative, and
indefinite, if there are both positive and negative eigen values, or if
some of the eigen values are zero.

As a particular case of positive-definite metric we have the case
where $D = I$, or all the eigen values of the metric matrix are 1. This
is the case of the Euclidean metric. A Euclidean metric is invariant
with respect to orthogonal transformations:

$$\langle x,y \rangle = x'Iy \xrightarrow{\;A\;} x'A'IAy = x'A'Ay = x'y = \langle x,y \rangle,$$

when $A'A = I$ or when $A$ is orthogonal.

A symmetric matrix $C$ all of whose eigen values are positive is called
a positive-definite matrix. Such a matrix has some remarkable formal
properties. Since its eigen values are all positive we can construct from
the eigen values the matrix

$$D^{1/2} = \begin{pmatrix} \sqrt{d_1} & 0 & \cdots & 0 \\ 0 & \sqrt{d_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{d_n} \end{pmatrix}.$$

Defining the matrix $S$ by $S = AD^{1/2}$, where $A$ is the orthogonal matrix
which transforms $C$ to $D$, and using the decomposition
$D = A'CA$ or $C = ADA'$ we get:

$$C = SS', \quad S'S = D. \tag{11}$$

The equation $SS' = C$ means, when written for the elements $c_{jk}$
of $C$ and the elements $s_{jk}$ of $S$, that

$$c_{jk} = \sum_i s_{ji} s_{ki}.$$

In words: the elements of a positive-definite matrix $C$ can always be
expressed in a product sum form.
From the inequality of Schwarz, $|\langle x,y \rangle| \le |x| \cdot |y|$, we get another
property of the elements of a positive-definite matrix $C$. Applying
this inequality to the unit vectors $e_1, e_2, \ldots, e_n$ we get:

$$|\langle e_j, e_k \rangle|^2 = |e_j' C e_k|^2 = c_{jk}^2 \le \langle e_j, e_j \rangle \langle e_k, e_k \rangle = c_{jj} c_{kk}.$$

This implies that

$$c_{jj} > 0 \quad \text{and} \quad c_{jk}^2 \le c_{jj} c_{kk}, \quad \text{for all } j, k = 1, \ldots, n$$

holds good for the elements of a positive-definite matrix $C$.

Let us return to the general case where $C$ is a symmetric, not necessarily
positive-definite matrix. By the theorem mentioned above it can be
transformed, by an orthogonal transformation $A$, to a diagonal matrix $D$:

$$A'CA = D, \quad \text{where } A'A = AA' = I.$$

Its eigen values, or the diagonal elements of $D$, can be given different
interpretations in connection with different mathematical problems.
First, from $A'CA = D$ and $AA' = I$ we get $AA'CA = CA = AD$.
Writing the $n$ columns of $A$ as the $n$ column vectors $u_1, u_2, \ldots, u_n$,

$$u_1 = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{pmatrix}, \quad u_2 = \begin{pmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{n2} \end{pmatrix}, \quad \ldots, \quad u_n = \begin{pmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{nn} \end{pmatrix},$$

we get from $CA = AD$ the equations

$$Cu_1 = d_1 u_1, \quad Cu_2 = d_2 u_2, \quad \ldots, \quad Cu_n = d_n u_n.$$

Accordingly, the eigen values $d_1, d_2, \ldots, d_n$ of $C$ are the $n$ solutions
of the equation

$$Cu = \lambda u, \quad \text{where } u \in W_n. \tag{12}$$

This is called the eigen value equation of the matrix $C$. To each eigen
value $d_j$ there corresponds an eigen vector $u_j$. If two eigen values are
equal, $d_j = d_k$, all the elements of the 2-dimensional vector space
spanned by $u_j$ and $u_k$ are eigen vectors belonging to this 2-fold 'degenerate'
eigen value. For a $j$-fold degeneration we get of course a $j$-dimensional
space of eigen vectors.
Secondly, from the eigen value equation we get

$$(C - \lambda I)u = 0.$$

If the matrix $C - \lambda I$ is non-singular, that is, if its determinant is not
zero, the inverse matrix $(C - \lambda I)^{-1}$ exists. Then we get

$$(C - \lambda I)u = 0 \;\to\; (C - \lambda I)^{-1}(C - \lambda I)u = Iu = u = 0.$$

Accordingly, if $|C - \lambda I| \neq 0$ the eigen value equation has only the trivial
solution $u = 0$. To get the non-trivial solutions we thus have to put

$$|C - \lambda I| = 0. \tag{13}$$

This gives the characteristic equation

$$\begin{vmatrix} c_{11} - \lambda & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} - \lambda & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} - \lambda \end{vmatrix} = 0$$

of the symmetric matrix $C$. Thus we get the eigen values $d_1, d_2, \ldots, d_n$
also as the solutions of the characteristic equation for the unknown $\lambda$.
It is a polynomial equation of the degree $n$ in $\lambda$.

Obviously, we can consider the eigen values of a nonsymmetric
matrix $C$ too, defining them by the eigen value equation or by the
characteristic equation.
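
For a symmetric $2 \times 2$ matrix the characteristic equation is the quadratic $\lambda^2 - (c_{11} + c_{22})\lambda + (c_{11}c_{22} - c_{12}^2) = 0$, which can be solved directly. The sketch below works under that restriction to $n = 2$; the function names and the example matrix are chosen here for illustration.

```python
import math

def eigenvalues_2x2_symmetric(c11, c12, c22):
    """Roots of lambda^2 - (c11 + c22)*lambda + (c11*c22 - c12^2) = 0."""
    trace = c11 + c22
    det = c11 * c22 - c12 * c12
    # The discriminant equals (c11 - c22)^2 + 4*c12^2, so it is never negative.
    root = math.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

def apply(C, u):
    """Matrix times column vector."""
    return [sum(c * x for c, x in zip(row, u)) for row in C]

C = [[2, 1], [1, 2]]  # arbitrary symmetric example
d1, d2 = eigenvalues_2x2_symmetric(C[0][0], C[0][1], C[1][1])

# u = (1, 1) satisfies the eigen value equation Cu = d2 * u.
u = [1, 1]
is_eigenvector = apply(C, u) == [d2 * x for x in u]

# Both eigen values are positive, so this C defines a positive-definite metric.
positive_definite = d1 > 0 and d2 > 0
```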


4 § Real Functions 


We have so far been discussing notions which all, in one way or another, 
are related with the function space F(A,R), that is, with functions from 
a basic set A to the set R of real numbers. In this way we came to the 
notion of vector space and to the linear operations in a vector space 
for which matrix operations were constructed. 

We shall now confine our scope further, and study the particular
case of a function space $F(A,R)$, where even the basic set $A$ is either
a subset of the set $R$ of real numbers or of a cartesian product
$R \times R \times \ldots \times R = R^n$ of the set $R$ of real numbers with itself. The
elements

$$f \in F(A,R), \quad \text{where } A \subset R^n,$$

are called the real functions of $n$ variables defined in $A$.
To each sequence $x_1, x_2, \ldots, x_n$ of real numbers from the set $A$ such
a function $f$ associates one and only one real number $y$:

$$f(x_1, x_2, \ldots, x_n) = y \in R, \quad (x_1, x_2, \ldots, x_n) \in A.$$

The real functions can be analyzed in more detail by means of differential
operations, the first of which is derivation.




1 / Derivation 


Let us begin with the real functions of one variable, that is, with functions
$f$ obeying

$$f \in F(A,R), \quad \text{where } A \subset R.$$

Such a function $f$ associates with every real number $x \in A$ one and only
one real number $y$:

$$f(x) = y \in R, \quad x \in A.$$

The function $f$ is called continuous in the 'point' $x$, if the difference
$f(x + \Delta x) - f(x)$ approaches zero when the real number $\Delta x$
approaches zero:

$$f(x + \Delta x) - f(x) \to 0 \quad \text{when } \Delta x \to 0.$$

It is continuous in an interval $(a,b) = \{x;\ a < x < b\} \subset A$ of real numbers,
if it is continuous in each point $x \in (a,b)$.
Denoting

$$f(x + \Delta x) - f(x) = \Delta f$$

we call a continuous function $f$ derivable in the point $x$, if the ratio $\Delta f / \Delta x$
approaches a uniquely determined real number (positive, negative,
or zero, and finite or infinite), when $\Delta x$ approaches zero. This real
number is called the derivative of $f$ in the point $x$, and it is denoted
by $\frac{df}{dx}$:

$$\frac{df}{dx} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \quad \text{so that} \quad \frac{\Delta f}{\Delta x} \to \frac{df}{dx} \quad \text{when } \Delta x \to 0.$$

If we want to indicate the point $x_0$ to which the derivative refers we
may write $\left(\frac{df}{dx}\right)_{x_0}$ or $f'(x_0)$:

$$\left(\frac{df}{dx}\right)_{x_0} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(x_0).$$


We observe that the derivative of the function $f$ in the point $x$ indicates
how steeply this function is increasing or decreasing in the point $x$
(see Fig. 7). If the function $f$ is increasing in the point $x_0$ its derivative
in this point is positive. If the function is decreasing its derivative is
negative. And if the function is neither increasing nor decreasing but
has a maximum or a minimum in the point $x_0$ then its derivative in
this point is zero. These cases are illustrated in Fig. 7. The figure also
illustrates how the derivative of the function $f$ in the point $x_0$ determines
the tangent of the curve $y = f(x)$ in the point $x_0$.

Fig. 7. Four panels: a real function $f$ which is continuous in the point $x_1$ but has no uniquely determined derivative in this point; a real function which has a positive derivative in the point $x_2$; a real function $f$ which has a negative derivative in the point $x_3$; a real function which has a maximum in the point $x_4$ and a minimum in the point $x_5$.
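
The limit process behind the derivative can be observed numerically by letting $\Delta x$ shrink. The following sketch uses the function $f(x) = x^2$ at the point $x_0 = 1$, an example chosen here for illustration; its derivative there is 2.

```python
def difference_quotient(f, x, dx):
    """The ratio (f(x + dx) - f(x)) / dx for a given increment dx."""
    return (f(x + dx) - f(x)) / dx

def f(x):
    """Illustrative function; its derivative at x is 2x."""
    return x * x

x0 = 1.0
increments = [10.0 ** (-k) for k in range(1, 6)]  # 0.1, 0.01, ..., 0.00001
quotients = [difference_quotient(f, x0, dx) for dx in increments]
errors = [abs(q - 2.0) for q in quotients]
```

The errors decrease roughly in proportion to the increment, which is the numerical counterpart of the ratio approaching a uniquely determined number.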




2 / Derivatives of Elementary Functions 


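It is useful to know the derivatives of the so called elementary functions.
For the power function $f(x) = x^a$ where $a$ is a real number, we get
at once:

$$\frac{\Delta f}{\Delta x} = \frac{(x + \Delta x)^a - x^a}{\Delta x} = \frac{x^a + a \cdot \Delta x \cdot x^{a-1} + \langle \Delta x^2 \rangle - x^a}{\Delta x} \to a x^{a-1}$$

when $\Delta x \to 0$. Here $\langle \Delta x^2 \rangle$ is a short notation for terms which
contain at least the second power of $\Delta x$. Accordingly,

$$\frac{dx^a}{dx} = a x^{a-1}.$$

When $a$ is a positive integer, both the power function and its derivative
are defined in the whole set $R$ of real numbers; for other real values of $a$
the domain is in general restricted to positive $x$.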
For the exponential function $f(x) = a^x$, where $a$ is a positive number,
we get first:

$$\frac{\Delta f}{\Delta x} = \frac{a^{x + \Delta x} - a^x}{\Delta x} = a^x \, \frac{a^{\Delta x} - 1}{\Delta x}.$$

We know that $a^0 = 1$. Accordingly, for a small positive $\Delta x$ the difference
$a^{\Delta x} - 1$ is a small number (positive when $a > 1$). It approaches
zero when $\Delta x$ approaches zero. If we write

$$a^{\Delta x} - 1 = \frac{1}{y},$$

the variable $y$ so defined approaches infinity when $\Delta x$ approaches zero,
and vice versa. Thus by introducing the variable $y$ we can consider
the limit process $y \to \infty$ instead of the limit process $\Delta x \to 0$. It follows
from $a^{\Delta x} = 1 + 1/y$ that we have, inversely,

$$\Delta x = \log_a \left(1 + \frac{1}{y}\right).$$

Thus the ratio $\Delta f / \Delta x$ can be expressed as

$$\frac{\Delta f}{\Delta x} = a^x \cdot \frac{1}{y \log_a \left(1 + \frac{1}{y}\right)} = a^x \cdot \frac{1}{\log_a \left(1 + \frac{1}{y}\right)^y}.$$

We have to study what becomes of $(1 + 1/y)^y$ when $y$ approaches
infinity.




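We shall first study what happens when $y$ approaches infinity through
the integer values $n$:

$$\left(1 + \frac{1}{n}\right)^n \to \; ? \quad \text{when } n \to \infty.$$

When $n$ is a positive integer we get by means of the binomial expansion
known from school:

$$\left(1 + \frac{1}{n}\right)^n = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{1 \cdot 2} \cdot \frac{1}{n^2} + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} \cdot \frac{1}{n^3} + \ldots$$

$$= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \ldots + \frac{1}{n!}\left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right).$$

Here we have denoted $1 \cdot 2 \cdot \ldots \cdot k = k!$. The factors $1 - \frac{1}{n}$, $1 - \frac{2}{n}$, ..
increase when $n$ increases from which we conclude that $\left(1 + \frac{1}{n}\right)^n$ also
increases when $n$ increases.

On the other hand, $\left(1 + \frac{1}{n}\right)^n$ never reaches 3. Indeed, we see this
if we observe that every factor $1 - \frac{1}{n}$, $1 - \frac{2}{n}$, .. approaches 1 when
$n$ approaches infinity. Accordingly,

$$\left(1 + \frac{1}{n}\right)^n \to 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \ldots \quad \text{when } n \to \infty.$$

But obviously

$$2 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \ldots < 2 + \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \ldots,$$

since, from the third term on, every term of the right side is larger than
the corresponding term on the left side. The infinite sum on the right
side approaches 3 without reaching it by any finite number of terms.
Indeed, we can follow the sum by starting with the number 2, advancing
then by adding to it half of the distance $(3 - 2)$ to get $2 + \frac{1}{2}$; then add
again half of the remaining distance $(3 - 2\tfrac{1}{2})$ to get $2 + \frac{1}{2} + \left(\frac{1}{2}\right)^2$;
then add half of the remaining distance to get $2 + \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3$, etc.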


Since $\left(1 + \frac{1}{n}\right)^n$ increases when $n$ increases but remains always smaller
than 3, we conclude that the limit

$$\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$$

exists, and is a number smaller than 3. This number is called the Neperian
number. It is one of the most important numbers in differential calculus.
We get from the above series at once a series for the computation of $e$:

$$e = 2 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \ldots \tag{14}$$

By means of this series we can calculate the numerical value of $e$ to
any accuracy desired. The result is 2.718281828459... ($e$ is an irrational
number).
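
Both routes to $e$ described above can be compared numerically. In this sketch the function names `compound` and `series_e` are introduced here for illustration; `series_e(terms)` adds the given number of terms after the leading 2 of series (14).

```python
import math

def compound(n):
    """The expression (1 + 1/n)^n for a positive integer n."""
    return (1 + 1 / n) ** n

def series_e(terms):
    """Partial sum 2 + 1/2! + 1/3! + ... with `terms` terms after the leading 2."""
    total, factorial = 2.0, 1
    for k in range(2, terms + 2):
        factorial *= k
        total += 1 / factorial
    return total

# (1 + 1/n)^n increases with n but stays below 3.
values = [compound(n) for n in (1, 10, 100, 1000)]

# The series converges much faster than the compound expression.
approx = series_e(15)
series_error = abs(approx - math.e)
compound_error = abs(values[-1] - math.e)
```

With fifteen terms the series already agrees with `math.e` to roughly the accuracy of floating-point arithmetic, while $(1 + 1/1000)^{1000}$ is still off in the third decimal.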

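It remains to be shown that the limit

$$\lim_{y \to \infty} \left(1 + \frac{1}{y}\right)^y = e$$

holds good even if $y$ is allowed to approach infinity through real numbers
in general, i.e. without restricting it to the integer values $n$. The
proof is based on the observation that if $n$ and $n+1$ are the two successive
integers between which the value of $y$ is,

$$n < y < n+1,$$

then

$$\left(1 + \frac{1}{y}\right)^y < \left(1 + \frac{1}{n}\right)^{n+1} = \left(1 + \frac{1}{n}\right)^n \left(1 + \frac{1}{n}\right),$$

while, on the other hand,

$$\left(1 + \frac{1}{y}\right)^y > \left(1 + \frac{1}{n+1}\right)^n = \left(1 + \frac{1}{n+1}\right)^{n+1} \Big/ \left(1 + \frac{1}{n+1}\right).$$

Since the number $\left(1 + \frac{1}{y}\right)^y$ is here between two numbers which both
approach $e$ when $y$ (and thus $n$) approaches infinity, the number $\left(1 + \frac{1}{y}\right)^y$
itself also approaches $e$.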
Using the number $e$ we get first for the function $f(x) = a^x$:

$$\frac{\Delta f}{\Delta x} \to \frac{a^x}{\log_a e} \quad \text{when } \Delta x \to 0.$$

However, the number $\log_a e$ means the number $x_0$ for which the equation
$a^{x_0} = e$ holds good. It follows from this equation that

$$a = e^{1/x_0},$$

so that $\log_e a = 1/x_0 = 1/(\log_a e)$. Accordingly we can write

$$\frac{\Delta f}{\Delta x} \to a^x \log_e a \quad \text{when } \Delta x \to 0.$$

The function $f(x) = \log_e x$ is called the natural logarithm function,
and the subscript $e$ is then often left off from the notation. Accordingly,
we can write our final result in the form

$$\frac{da^x}{dx} = a^x \log a.$$

The exponential function and its derivative are also defined in the whole
set $R$ of real numbers.
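
The result for the exponential function can be compared with a difference quotient. The sketch below uses the arbitrary values $a = 2$ and $x = 1.5$, chosen here for illustration; `math.log` is the natural logarithm.

```python
import math

def exp_difference_quotient(a, x, dx):
    """The ratio (a^(x + dx) - a^x) / dx."""
    return (a ** (x + dx) - a ** x) / dx

a, x = 2.0, 1.5
numeric = exp_difference_quotient(a, x, 1e-6)
exact = a ** x * math.log(a)  # a^x * log a, with log the natural logarithm
relative_error = abs(numeric - exact) / exact

# For a = e the factor log a equals 1, so the derivative of e^x is e^x itself.
factor_for_e = math.log(math.e)
```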
To calculate the derivatives of the elementary trigonometric functions
$f(x) = \sin x$ and $f(x) = \cos x$ we can use the geometric construction
shown in Fig. 8.

Fig. 8. The differential triangle

This figure shows a "differential right-angled triangle" in which $\Delta x$
is the hypotenuse, while $\Delta \sin x$ and $\Delta \cos x$ are the catheti. We observe
that the true ratios $\Delta \sin x / \Delta x$ and $\Delta \cos x / \Delta x$ approach the ratios obtained
from this differential triangle when $\Delta x$ approaches zero. Accordingly,
we get

$$\Delta \sin x \to \Delta x \cdot \cos x \quad \text{when } \Delta x \to 0, \text{ and}$$

$$\Delta \cos x \to -\Delta x \cdot \sin x \quad \text{when } \Delta x \to 0.$$

Thus

$$\frac{d \sin x}{dx} = \cos x, \quad \frac{d \cos x}{dx} = -\sin x.$$

The trigonometric functions too are defined in the whole set $R$.

The derivatives of all elementary functions can be calculated on the
basis of the above given derivatives, if we know some general rules
holding for the construction of a derivative. These rules are the following:

$$ \frac{d(f+g)}{dx} = \frac{df}{dx} + \frac{dg}{dx} \quad \text{(derivation of a sum)}, $$

$$ \frac{df(g)}{dx} = \frac{df}{dg} \cdot \frac{dg}{dx} \quad \text{(derivation of a composite function)}, $$

$$ \frac{d}{dx}\bigl(f(x)g(x)\bigr) = f(x)\frac{dg}{dx} + \frac{df}{dx}\,g(x) \quad \text{(derivation of a product)}, $$

$$ \frac{d}{dx}\,\frac{f(x)}{g(x)} = \frac{\frac{df}{dx}\,g(x) - f(x)\,\frac{dg}{dx}}{g(x)^{2}} \quad \text{(derivation of a ratio)}, $$

$$ \frac{df^{-1}}{dy} = \frac{1}{\frac{df}{dx}} \quad \text{(derivation of an inverse)}, $$

where we have denoted f(x) = y so that $x = f^{-1}(y)$. These rules could
be easily derived but it would be a waste of time, since the rules have
only technical significance.

To give an example of the application of these rules, let us construct
the derivative of the logarithm function. Since $y = \log_{a} x$ means that
$a^{y} = x$, the function $f(x) = \log_{a} x$ is the inverse of the function $f(x) = a^{x}$.
The logarithm function $\log_{a} x$ is defined, obviously, only for positive
real numbers. Writing $y = a^{x}$, the reciprocal of the derivative of the
latter function is $1/(a^{x} \log a) = 1/(y \log a)$. Accordingly, using the
last rule above we get

$$ \frac{d \log_{a} y}{dy} = \frac{1}{y \log a}. $$




Of course we could then again substitute the letter x in place of y,
and write $d \log_{a} x/dx = 1/(x \log a)$.

In particular, for the functions $e^{x}$ and $\log x$ we get, since
$\log e = 1$, the derivatives

$$ \frac{de^{x}}{dx} = e^{x}, \qquad \frac{d \log x}{dx} = \frac{1}{x}. $$
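The rule of derivation of an inverse, as applied here to the logarithm, can be illustrated numerically. The sketch below is not from the text: the helper `derivative` (a central difference quotient standing in for the limit) and the sample values a = 10, x = 0.7 are assumptions chosen for the illustration.

```python
import math

def derivative(f, x, dx=1e-6):
    """Central difference quotient, used here as a stand-in for the limit."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)

a = 10.0
x = 0.7
y = a ** x                                            # y = a^x, so x = log_a y

d_forward = derivative(lambda t: a ** t, x)           # d a^x / dx at the point x
d_inverse = derivative(lambda t: math.log(t, a), y)   # d log_a y / dy at the point y

# The three numbers should agree: the inverse rule and the closed formula.
print(d_inverse, 1 / d_forward, 1 / (y * math.log(a)))
```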


If y = sin x, then we write x = arc sin y. Thus the function
f(x) = arc sin x is the inverse of the function f(x) = sin x. Since the reciprocal
of the derivative of sin x is $1/\cos x = 1/(1-\sin^{2}x)^{1/2} = 1/(1-y^{2})^{1/2}$,
we get

$$ \frac{d \arcsin x}{dx} = \frac{1}{\sqrt{1-x^{2}}}. $$

In a similar way we get

$$ \frac{d \arccos x}{dx} = -\frac{1}{\sqrt{1-x^{2}}}. $$


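For the ratio functions

$$ \tan x = \frac{\sin x}{\cos x} \quad \text{and} \quad \cot x = \frac{\cos x}{\sin x} $$

we get, using the rule of derivation of a ratio:

$$ \frac{d \tan x}{dx} = \frac{1}{\cos^{2}x}, \qquad \frac{d \cot x}{dx} = -\frac{1}{\sin^{2}x}. $$

And for the inverse functions of tan x and cot x we get

$$ \frac{d \arctan x}{dx} = \frac{1}{1+x^{2}}, \qquad \frac{d\,\mathrm{arccot}\,x}{dx} = -\frac{1}{1+x^{2}}. $$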


To give an example of the application of the derivation of a composite
function, let us consider the derivative of the function

$$ f(x) = e^{-x^{2}}. $$

We can consider it as a composite function by writing

$$ f(x) = e^{-x^{2}} = e^{y}, \quad \text{where } y = -x^{2}. $$

Thus we get

$$ \frac{df}{dx} = \frac{de^{y}}{dy} \cdot \frac{dy}{dx} = -2x\,e^{-x^{2}}. $$
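The result of the composite-function rule can be compared with a direct difference quotient. This is a minimal sketch; the names `f`, `df_chain` and `central_difference` and the sample point x = 0.8 are assumptions introduced for the illustration.

```python
import math

def f(x):
    """f(x) = exp(-x^2), regarded as exp(y) with y = -x^2."""
    return math.exp(-x * x)

def df_chain(x):
    """Derivative by the chain rule: (d e^y / dy) * (dy / dx) = e^y * (-2x)."""
    y = -x * x
    return math.exp(y) * (-2 * x)

def central_difference(g, x, dx=1e-6):
    """Symmetric difference quotient approximating dg/dx."""
    return (g(x + dx) - g(x - dx)) / (2 * dx)

x = 0.8
print(df_chain(x), central_difference(f, x))
```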




To give an example of the application of the derivation of a sum,
let us construct the derivative of a polynomial

$$ P(x) = a_{0} + a_{1}x + a_{2}x^{2} + \dots + a_{n}x^{n}. $$

By the sum rule, and by means of the derivatives of powers we get:

$$ \frac{dP(x)}{dx} = a_{1} + 2a_{2}x + \dots + n a_{n}x^{n-1}. $$
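In terms of the coefficient list the rule is purely mechanical, as the following sketch shows. The list representation (entry k holds the coefficient of x^k) and the function names `poly_derivative` and `poly_eval` are assumptions made for this illustration.

```python
def poly_derivative(coeffs):
    """coeffs[k] is a_k in P(x) = a_0 + a_1 x + ... + a_n x^n.

    Returns the coefficients a_1, 2 a_2, ..., n a_n of dP/dx.
    """
    return [k * a_k for k, a_k in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Evaluate the polynomial by Horner's scheme."""
    result = 0
    for a_k in reversed(coeffs):
        result = result * x + a_k
    return result

p = [5, 3, 0, 2]                 # P(x) = 5 + 3x + 2x^3
dp = poly_derivative(p)          # dP/dx = 3 + 6x^2
print(dp, poly_eval(dp, 2))
```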


3 / Analytic Function 


If we are given a continuous function f from $A \subset R$ to R, and if this
function has a uniquely determined derivative in every point of A,
the derivative also defines a function from A to R. Let us denote this
function as

$$ \frac{df}{dx}(x) = y. $$

This function is continuous, if

$$ \frac{df}{dx}(x+\Delta x) - \frac{df}{dx}(x-\Delta x) \to 0 \quad \text{when } \Delta x \to 0. $$

If the ratio $\Delta\frac{df}{dx}/\Delta x$ approaches a uniquely determined real number
when $\Delta x$ approaches zero, this number is the derivative of the function
$\frac{df}{dx}$ in the point x:

$$ \frac{\Delta \frac{df}{dx}}{\Delta x} = \frac{\frac{df}{dx}(x+\Delta x) - \frac{df}{dx}(x)}{\Delta x} \to \frac{d\left(\frac{df}{dx}\right)}{dx}. $$

The derivative of the function $\frac{df}{dx}$ is called the second derivative of
the function f, and it is then denoted as follows:

$$ \frac{d^{2}f}{dx^{2}} = \frac{d\left(\frac{df}{dx}\right)}{dx}. $$

The second derivative too defines a function from R to R,

$$ \frac{d^{2}f}{dx^{2}}(x) = y. $$




If this function is continuous it may have a uniquely determined derivative.
This is called the third derivative of the function f and is denoted by

$$ \frac{d^{3}f}{dx^{3}} = \frac{d\left(\frac{d^{2}f}{dx^{2}}\right)}{dx}. $$

In this way we can continue, and define the higher order derivatives
of the given function f. If the function f is continuous and has all the
derivatives $\frac{df}{dx}, \frac{d^{2}f}{dx^{2}}, \dots$ uniquely determined and continuous in an
interval $A_{0} \subset A$ of real numbers, this function is said to be analytic
in the interval $A_{0}$. An analytic function f(x) has the important property
that it can be expanded into a series of the powers $(x-a)$, $(x-a)^{2}$,
$(x-a)^{3}, \dots$ where a is a real number, if the derivatives of the function
f in the point x = a are finite.

Let us consider first a simple example of an analytic function for which
the expansion in power series can be given at once. This is the case
of a polynomial P(x):

$$ P(x) = a_{0} + a_{1}x + a_{2}x^{2} + \dots + a_{n}x^{n}. $$

We observe that the function P(x) is continuous in the whole set R
of real numbers, and all its derivatives are uniquely determined and
continuous also in the whole set R. For the derivatives in the point x
we get, by applying successively the sum rule and the derivation of
a power:

$$ \frac{dP(x)}{dx} = a_{1} + 2a_{2}x + \dots + n a_{n}x^{n-1}, $$

$$ \frac{d^{2}P(x)}{dx^{2}} = 2a_{2} + 3\cdot 2\,a_{3}x + \dots + n(n-1)a_{n}x^{n-2}, $$

$$ \dots $$

$$ \frac{d^{n}P(x)}{dx^{n}} = n(n-1)(n-2)\cdots 1\cdot a_{n} = n!\,a_{n}, $$

while all the derivatives whose order is greater than n are zero.
If we take the derivatives of P(x) in the point x = 0 we get the a-coefficients
of the polynomial, multiplied by the numbers 1, 2!, 3!, ..., respectively.
Thus we observe that we can write the polynomial in the form

$$ P(x) = P(0) + \frac{dP}{dx}(0)\,x + \frac{1}{2!}\frac{d^{2}P}{dx^{2}}(0)\,x^{2} + \dots + \frac{1}{n!}\frac{d^{n}P}{dx^{n}}(0)\,x^{n}. $$

Here we have the expansion of the analytic function P in terms of
the powers of x - 0 = x.




We can generalize this to an expansion of P in terms of the powers
of the general difference x-a by writing x = a+(x-a), and rewriting
the polynomial and its derivatives in terms of the powers of x-a.
This leads to the general result

$$ P(x) = P(a) + \frac{dP}{dx}(a)(x-a) + \frac{1}{2!}\frac{d^{2}P}{dx^{2}}(a)(x-a)^{2} + \dots + \frac{1}{n!}\frac{d^{n}P}{dx^{n}}(a)(x-a)^{n}. $$
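The re-expansion of a polynomial in powers of x-a can be carried out mechanically from its derivatives at the point a. The sketch below is an illustration under assumptions made here: polynomials are stored as coefficient lists (entry k belongs to x^k), and the function names are hypothetical.

```python
from math import factorial

def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's scheme."""
    result = 0
    for a_k in reversed(coeffs):
        result = result * x + a_k
    return result

def poly_derivative(coeffs):
    """Coefficients of dP/dx for P given by coeffs."""
    return [k * a_k for k, a_k in enumerate(coeffs)][1:]

def taylor_coefficients(coeffs, a):
    """Return [P(a), P'(a)/1!, P''(a)/2!, ...], the coefficients of (x-a)^k."""
    result = []
    current = list(coeffs)
    k = 0
    while current:                      # stops once the derivative is identically zero
        result.append(poly_eval(current, a) / factorial(k))
        current = poly_derivative(current)
        k += 1
    return result

# P(x) = 1 + 2x + 3x^2 around a = 1 becomes 6 + 8(x-1) + 3(x-1)^2.
print(taylor_coefficients([1, 2, 3], 1))
```

Evaluating the new coefficient list at x-a gives back the value of the original polynomial at x, which is a convenient check of the expansion.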

One can show that this can be generalized to apply to every analytic
function f which has finite derivatives in the point x = a. Accordingly,
we can write such an analytic function as an infinite power series

(15)  $$ f(x) = f(a) + \frac{df}{dx}(a)(x-a) + \frac{1}{2!}\frac{d^{2}f}{dx^{2}}(a)(x-a)^{2} + \frac{1}{3!}\frac{d^{3}f}{dx^{3}}(a)(x-a)^{3} + \dots $$

Such an expansion is called the Taylor expansion of the analytic function f,
and it converges in a neighbourhood of the point a.

Let us apply this to the exponential function $f(x) = e^{x}$. It is continuous
everywhere, and has the uniquely determined derivative $e^{x}$ in the point
x. But then the derivative is also everywhere continuous and has a
uniquely determined derivative, the second derivative of the function f,
which is also $e^{x}$. Thus every derivative of the function $f(x) = e^{x}$ is
the function f itself so that $e^{x}$ is analytic everywhere. Taking their
values in the point x = 0 we observe that every derivative of the exponential
function has the value 1 in the point x = 0. Accordingly, the
Taylor expansion of the exponential function becomes

$$ e^{x} = 1 + x + \frac{1}{2!}x^{2} + \frac{1}{3!}x^{3} + \dots $$

This infinite power series represents the function $e^{x}$ for every real
number x.
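The convergence of this series is easy to observe by summing its first terms. The following sketch is illustrative only; the function name `exp_series` and the chosen numbers of terms are assumptions.

```python
import math

def exp_series(x, terms):
    """Sum of the first `terms` terms 1 + x + x^2/2! + x^3/3! + ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)      # turns x^k/k! into x^(k+1)/(k+1)!
    return total

for terms in (2, 5, 10, 20):
    print(terms, exp_series(1.0, terms), math.e)
```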

An equally easy derivation of the Taylor expansion is obtained for
the trigonometric functions sin x and cos x. Since d sin x/dx = cos x,
and d cos x/dx = -sin x, and since cos(0) = 1 and sin(0) = 0, we
get at once:

$$ \sin x = x - \frac{1}{3!}x^{3} + \frac{1}{5!}x^{5} - \dots, $$

$$ \cos x = 1 - \frac{1}{2!}x^{2} + \frac{1}{4!}x^{4} - \dots $$




The power series representing $e^{x}$, sin x, and cos x indicate obviously
a connection between these three elementary functions. This connection
can be made explicit if we introduce the imaginary unit i, known from
school. As we remember i was defined as an algebraic element which
can be added to and multiplied by real numbers, and which itself obeys
the rule $i^{2} = -1$. Using this rule we obtain from the above series the
Euler formula

(16)  $$ e^{ix} = \cos x + i \sin x, $$

which we shall need later on.
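The Euler formula can be checked by inserting the imaginary argument ix into the exponential series and comparing with cos x + i sin x. The sketch below is only an illustration; the function name `exp_series`, the number of terms and the sample value x = 0.75 are assumptions.

```python
import math

def exp_series(z, terms=30):
    """Partial sum of 1 + z + z^2/2! + ... for a real or complex argument z."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= z / (k + 1)
    return total

x = 0.75
lhs = exp_series(1j * x)                       # series evaluated at ix
rhs = complex(math.cos(x), math.sin(x))        # cos x + i sin x
print(lhs, rhs)
```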

Another useful Taylor expansion is that of the function $f(x) = (1+x)^{a}$,
where a is an arbitrary real number. For the derivatives of this function
we get successively:

$$ \frac{df}{dx} = a(1+x)^{a-1} \to a \quad \text{when } x \to 0, $$

$$ \frac{d^{2}f}{dx^{2}} = a(a-1)(1+x)^{a-2} \to a(a-1) \quad \text{when } x \to 0, $$

$$ \frac{d^{3}f}{dx^{3}} = a(a-1)(a-2)(1+x)^{a-3} \to a(a-1)(a-2) \quad \text{when } x \to 0, $$

etc.

Accordingly the Taylor expansion at the point x = 0 is

$$ (1+x)^{a} = 1 + ax + \frac{a(a-1)}{2!}x^{2} + \frac{a(a-1)(a-2)}{3!}x^{3} + \dots $$

Here we meet again the binomial coefficients, known from school.
This infinite series is called the binomial series.

Since the analytic functions form the most restricted type of functions,
satisfying all the thinkable requirements in order to be "nice and continuous",
mathematical analysis can be carried furthest into detail
in the very case of analytic functions. On the other hand, the domain
of applicability of relations holding true only for analytic functions
is narrow.
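The binomial series can be summed term by term, each coefficient being obtained from the previous one. The sketch below is an illustration; the function name `binomial_series` is an assumption, and the sample arguments are taken with |x| < 1, where the series is known to converge for every real exponent a.

```python
def binomial_series(a, x, terms):
    """Partial sum of 1 + a x + a(a-1)/2! x^2 + a(a-1)(a-2)/3! x^3 + ..."""
    total, coeff = 0.0, 1.0
    for k in range(terms):
        total += coeff * x ** k
        coeff *= (a - k) / (k + 1)   # next coefficient a(a-1)...(a-k)/(k+1)!
    return total

print(binomial_series(0.5, 0.2, 30), 1.2 ** 0.5)   # square root of 1.2
print(binomial_series(3, 0.5, 10), 1.5 ** 3)       # for integer a the series terminates
```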


4 / Partial and Total Derivatives 


What has been said above on derivation and analytic functions can
in a trivial way be generalized to functions f from $A \subset R^{n}$ to R. Such
a function is a function of several variables $x_{1}, \dots, x_{n}$:

$$ f(x_{1}, \dots, x_{n}) = y \in R, \qquad (x_{1}, \dots, x_{n}) = x \in A \subset R^{n}. $$




The derivative of f with respect to the variable $x_{i}$ can be defined as
if f were a function of $x_{i}$ only, i.e. keeping the other variables constant
in the limiting process:

$$ \frac{\partial f}{\partial x_{i}} = \lim_{\Delta x_{i} \to 0} \frac{f(x_{1}, \dots, x_{i}+\Delta x_{i}, \dots, x_{n}) - f(x)}{\Delta x_{i}}. $$

This is called the partial derivative of f with respect to $x_{i}$.
It may happen that a function f contains a variable $x_{i}$ not only
explicitly, as above, but also implicitly: some of the other variables
may be functions of $x_{i}$. In this case the function f has the structure

$$ f(x_{1}(x_{i}), \dots, x_{i}, \dots, x_{n}(x_{i})) = y \in R. $$

Then we may sometimes be interested also in the total derivative of f
with respect to $x_{i}$. This is given by

(17)  $$ \frac{Df}{Dx_{i}} = \frac{\partial f}{\partial x_{1}}\frac{\partial x_{1}}{\partial x_{i}} + \dots + \frac{\partial f}{\partial x_{i}} + \dots + \frac{\partial f}{\partial x_{n}}\frac{\partial x_{n}}{\partial x_{i}}. $$

Accordingly, the total derivative is constructed by means of the rule
of derivation of a composite function, and by the rule of derivation
of a sum.
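A small example may make the difference between the partial and the total derivative concrete. The function f(x, u) = xu + u^2 with u = x^2, and all names in the sketch, are assumptions chosen here for illustration. The total derivative adds to the explicit partial derivative the contribution that enters through u.

```python
def f(x, u):
    return x * u + u ** 2

def u_of_x(x):
    """The second variable depends implicitly on x."""
    return x ** 2

def partial_x(x, u):
    return u                    # df/dx with u kept constant

def partial_u(x, u):
    return x + 2 * u            # df/du with x kept constant

def du_dx(x):
    return 2 * x

def total_derivative(x):
    """Df/Dx = df/dx + (df/du)(du/dx), both rules of the text combined."""
    u = u_of_x(x)
    return partial_x(x, u) + partial_u(x, u) * du_dx(x)

def central_difference(g, x, dx=1e-6):
    return (g(x + dx) - g(x - dx)) / (2 * dx)

x = 1.5
print(total_derivative(x), central_difference(lambda t: f(t, u_of_x(t)), x))
```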

The partial derivatives of second order are defined by

$$ \frac{\partial^{2} f}{\partial x_{i}\,\partial x_{j}} = \frac{\partial}{\partial x_{i}}\left(\frac{\partial f}{\partial x_{j}}\right). $$

In this way we can proceed to partial derivatives of higher order and
define, for instance, an analytic function of several variables. Such
a function must have continuous derivatives of all orders with respect
to all the variables. In a point $a \in R^{n}$ where all the derivatives are finite
an analytic function can be expanded in power series of the differences
$(x_{1}-a_{1}), \dots, (x_{n}-a_{n})$. The Taylor expansion of an analytic function
of several variables differs from that of an analytic function of one
variable only in so far as there is a summation over the terms referring
to different variables $x_{1}, \dots, x_{n}$ in each power:

(18)  $$ f(x) = f(a) + \frac{\partial f}{\partial x_{1}}(a)(x_{1}-a_{1}) + \dots + \frac{\partial f}{\partial x_{n}}(a)(x_{n}-a_{n}) + \frac{1}{2!}\frac{\partial^{2} f}{\partial x_{1}^{2}}(a)(x_{1}-a_{1})^{2} + 2\cdot\frac{1}{2!}\frac{\partial^{2} f}{\partial x_{1}\,\partial x_{2}}(a)(x_{1}-a_{1})(x_{2}-a_{2}) + \dots $$




5 / Integral and Integral Function 


Let f be a continuous real function of one variable. Let us consider
the area left between the curve y = f(x) and the abscissa y = 0 in an
interval (a,z) along the x-axis (see Fig. 9). This area $I_{a}(z)$ can be
approximated by choosing n-1 points $x_{1}, \dots, x_{n-1}$ inside the interval
(a,z), and constructing the sum

$$ S_{a}(z,n) = \sum_{i=1}^{n} f(x_{i})\,\Delta x_{i}. $$

Here $\Delta x_{i} = x_{i} - x_{i-1}$, there being $x_{0} = a$ and $x_{n} = z$. Obviously
this sum approaches $I_{a}(z)$ when n approaches infinity in such a way
that all the intervals $\Delta x_{i}$ approach zero:

$$ S_{a}(z,n) \to I_{a}(z) \quad \text{when } n \to \infty \text{ so that all } \Delta x_{i} \to 0. $$

The limit sum is called the integral of the function f from the point a
to the point z, and it is denoted as follows:

(19)  $$ I_{a}(z) = \int_{a}^{z} f(x)\,dx. $$


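The function $I_{a}$ as a function of z is called the integral function of the
respective function f.
For the derivative of the integral function we get:

$$ \frac{dI_{a}(z)}{dz} = \lim_{\Delta z \to 0} \frac{I_{a}(z+\Delta z) - I_{a}(z)}{\Delta z}. $$

But, for small $\Delta z$, $I_{a}(z+\Delta z) = I_{a}(z) + f(z)\Delta z$ to first order in $\Delta z$, so that

(20)  $$ \frac{dI_{a}(z)}{dz} = \lim_{\Delta z \to 0} \frac{f(z)\Delta z}{\Delta z} = f(z). $$

Fig. 9. For the definition of integral function.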




Thus the derivative of the integral function $I_{a}$ is the function f itself,
of which $I_{a}$ is the integral function. This gives the rule for the
calculation of an area $I_{a}(z)$ in practice.

We observe that in order to find the area $I_{a}(z)$ we have first to find
a function I(z) whose derivative is the given function f(z). This
function contains an arbitrary additive constant, since the derivative
of a constant is zero. Accordingly, we have

$$ I(z) = I_{0}(z) + C, $$

where $I_{0}(z)$ is a definite function of z and C is an arbitrary real number.
Then we have to choose C so that the value I(a) becomes zero:

$$ C = -I_{0}(a). $$

The desired area is then given by the value I(z) so defined:

$$ I_{a}(z) = I_{0}(z) - I_{0}(a). $$


For instance, the derivative of the function $f(x) = x^{a}$ is $a x^{a-1}$
for any real number a. Accordingly, for $a \neq -1$ the function $f(x) = x^{a}$ itself is
the derivative of the function $I_{0}(x) = x^{a+1}/(a+1)$:

$$ \frac{d}{dx}\,\frac{x^{a+1}}{a+1} = x^{a}. $$

Thus the area left between the curve $y = x^{a}$ and the axis y = 0
in the interval (0,1) is, for a > -1, given by

$$ \int_{0}^{1} x^{a}\,dx = \frac{1^{a+1}}{a+1} - \frac{0^{a+1}}{a+1} = \frac{1}{a+1}. $$
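The limit sum of the definition can be compared with this closed result. The sketch below is illustrative; the function name `riemann_sum`, the choice of n equal subintervals with the points x_i taken at their right ends, and the exponent 2 are assumptions made here.

```python
def riemann_sum(f, a, z, n):
    """Sum of f(x_i) * dx_i over n equal subintervals of (a, z).

    x_i is the right end point of the i-th subinterval, so that
    dx_i = x_i - x_(i-1) as in the definition of the sum S_a(z, n).
    """
    dx = (z - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

p = 2                                        # exponent of the integrand x^p
for n in (10, 100, 10000):
    print(n, riemann_sum(lambda x: x ** p, 0.0, 1.0, n), 1 / (p + 1))
```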


If we have to calculate an area bounded by two curves y = f(x)
and y = g(x) we simply form the difference y = f(x)-g(x) and determine
the area between y = f(x)-g(x) and y = 0. So the method is not restricted
to an area between a curvilinear and a linear function but
can be applied more generally.

Of course all the methods of constructing the derivative, like the
derivation of a composite function, sum, product, etc., can be applied
in an integration problem. This gives for each rule of derivation a
corresponding rule of integration. We can, however, pass over this here.




6 / Space Integral 


Instead of considering the integral

$$ I_{a}(z) = \int_{a}^{z} f(x)\,dx $$

as an area we can consider it as the "total mass" of the values of the
function f distributed over the interval (a,z). In this way we observe
that the integral can be generalized to the case of a real function of
any finite number of variables.

For the sake of simplicity, let us consider the case of three real variables.
If we denote the three real variables by x, y, and z, the domain of f is
the vector space $V_{3}$ where any sequence (x,y,z) gives the coordinates
of a certain point along three mutually orthogonal coordinate axes.
Let us consider a closed volume V in this vector space. Let the volume
V be connected, which means that any two points $(x_{1},y_{1},z_{1})$ and
$(x_{2},y_{2},z_{2})$ within the volume V can be connected with one another by
means of a continuous curve.

Let us divide the volume V into small cubes by means of planes
parallel to the coordinate planes. Let the number of the cubes be n,
and their volumes be $\Delta v_{1}, \dots, \Delta v_{n}$. The limit sum

$$ \int_{V} f(x,y,z)\,dV = \lim_{\substack{n \to \infty \\ \Delta v_{i} \to 0}} \sum_{i=1}^{n} f(x_{i},y_{i},z_{i})\,\Delta v_{i}, $$

if it exists and is finite, is called the integral of the function f over the
volume V. Here $(x_{i},y_{i},z_{i})$ is a point within the volume $\Delta v_{i}$.

If the volume V fulfills certain conditions of regularity the volume
integration of a continuous function f over V can be performed by
means of successive integrations with respect to one of the variables
x, y, and z in turn. For this purpose let us assume that $x_{1}$ is the minimal
and $x_{2}$ the maximal x-coordinate of the points within the volume V.
And let us assume that any straight line in the (xy)-plane, parallel to
the coordinate axis y, meets the boundary of the projection of V on
this plane only at two points whose y-coordinates are given by the two
functions $y = y_{1}(x)$ and $y = y_{2}(x)$, with $y_{2}(x) \ge y_{1}(x)$. And finally, let
us assume that any straight line parallel to the z-axis meets the boundary
of the volume V only at two points whose z-coordinates are given by
the two functions $z = z_{1}(x,y)$ and $z = z_{2}(x,y)$, there being
$z_{2}(x,y) \ge z_{1}(x,y)$.




Fig. 10. Integration over a volume 


On these conditions we can perform the integration over the volume
V as a three-fold integration

$$ \int_{V} f(x,y,z)\,dV = \int_{x_{1}}^{x_{2}} dx \int_{y_{1}(x)}^{y_{2}(x)} dy \int_{z_{1}(x,y)}^{z_{2}(x,y)} f(x,y,z)\,dz. $$

This means that we perform first the integration of f as a function
of one variable, viz. z, from the value $z_{1}(x,y)$ to the value $z_{2}(x,y)$. As
a result we get a function F of x and y:

$$ F(x,y) = \int_{z_{1}(x,y)}^{z_{2}(x,y)} f(x,y,z)\,dz. $$

Then we perform the integration of F(x,y) as a function of one variable,
viz. y, from the value $y_{1}(x)$ to the value $y_{2}(x)$. This gives us a function
G of x:

$$ G(x) = \int_{y_{1}(x)}^{y_{2}(x)} F(x,y)\,dy. $$

Finally, we perform the integration of G(x) from $x_{1}$ to $x_{2}$ to get

$$ \int_{V} f(x,y,z)\,dV = \int_{x_{1}}^{x_{2}} G(x)\,dx. $$




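In order to get simple functions F and G as "integrands" we may
perform transformations of the variables x, y, and z. If x, y, and z are
represented as three functions of some new real variables $\xi$, $\eta$, and $\zeta$,

$$ x = x(\xi,\eta,\zeta), \quad y = y(\xi,\eta,\zeta), \quad z = z(\xi,\eta,\zeta), $$

so that the correspondence $(x,y,z) \leftrightarrow (\xi,\eta,\zeta)$ is one-to-one, the functions
being continuous and having first partial derivatives, then we
can transform the integral as follows:

(21)  $$ \int_{V} f(x,y,z)\,dV = \int_{\xi_{1}}^{\xi_{2}} d\xi \int_{\eta_{1}(\xi)}^{\eta_{2}(\xi)} d\eta \int_{\zeta_{1}(\xi,\eta)}^{\zeta_{2}(\xi,\eta)} f\bigl(x(\xi,\eta,\zeta), y(\xi,\eta,\zeta), z(\xi,\eta,\zeta)\bigr) \left|\frac{\partial(x,y,z)}{\partial(\xi,\eta,\zeta)}\right| d\zeta. $$

Here appears the functional determinant defined by

$$ \frac{\partial(x,y,z)}{\partial(\xi,\eta,\zeta)} = \begin{vmatrix} \dfrac{\partial x}{\partial \xi} & \dfrac{\partial y}{\partial \xi} & \dfrac{\partial z}{\partial \xi} \\ \dfrac{\partial x}{\partial \eta} & \dfrac{\partial y}{\partial \eta} & \dfrac{\partial z}{\partial \eta} \\ \dfrac{\partial x}{\partial \zeta} & \dfrac{\partial y}{\partial \zeta} & \dfrac{\partial z}{\partial \zeta} \end{vmatrix}. $$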


Let us calculate an example. Let there be

$$ f(x,y,z) = (x^{2}+y^{2})z, $$
$$ x_{1} = -1, \quad x_{2} = +1, $$
$$ y_{1} = -\sqrt{1-x^{2}}, \quad y_{2} = +\sqrt{1-x^{2}}, $$
$$ z_{1} = 0, \quad z_{2} = h. $$

Accordingly, the volume V is a cylinder of the height h, whose projection
on the (xy)-plane is given by the circle $x^{2}+y^{2} = 1$.
First we get

$$ F(x,y) = \int_{0}^{h} (x^{2}+y^{2})\,z\,dz = \frac{1}{2}(x^{2}+y^{2})h^{2}. $$

Then we introduce in the (xy)-plane the new variables r and $\varphi$ by

$$ x = r\cos\varphi, \quad y = r\sin\varphi. $$




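We have thus

$$ F(x(r,\varphi), y(r,\varphi)) = \frac{1}{2}h^{2}r^{2}, $$

$$ \frac{\partial(x,y)}{\partial(r,\varphi)} = \begin{vmatrix} \cos\varphi & \sin\varphi \\ -r\sin\varphi & r\cos\varphi \end{vmatrix} = r\cos^{2}\varphi + r\sin^{2}\varphi = r. $$

The integration over the circle $x^{2}+y^{2} = 1$ means integration from
0 to 1 in the r-variable, and integration from 0 to $2\pi$ in the $\varphi$-variable.
Accordingly,

$$ \int_{V} (x^{2}+y^{2})\,z\,dV = \frac{h^{2}}{2}\int_{0}^{1} r^{2}\cdot r\,dr \int_{0}^{2\pi} d\varphi = \frac{\pi h^{2}}{4}. $$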

If the boundaries of V are given by curves indicating constant values 
of the variables of integration, then the order of successive integrations 
is arbitrary. 

The definition of space integral can in a trivial way be generalized to 
any finite number of variables. 


5 § Topological Notions 


The more advanced analysis of the real functions, and of the complex 
functions closely associated with them, is based on topological notions 
and the notion of complex number. We shall discuss elements of these 
notions briefly in the remaining sections of this chapter. 


1 / Topological Space 


Topology is a generalization from the ’geometrical’ properties of real
numbers. Representing the real numbers along a straight line, and
cutting the line at two points, say a and b, we have an ’interval’ (a,b)
of real numbers (see Fig. 11). Since the real numbers form a continuum,
there is an infinite number of real numbers within each interval (a,b) in
so far as $a \neq b$. In other words, we cannot make the interval (a,b) so
small that there would be less than an infinite number of real numbers
between the numbers a and b, as far as $a \neq b$. Omitting the points a
and b themselves from the interval (a,b) we still have an infinite number
of points in the so obtained ’open interval’ formed by the points between
a and b. When the end points a and b are counted as belonging to the
interval (a,b) we have a ’closed interval’. The notions of open and
closed interval are generalized in topology by introducing in a general
set the operation of closure in the following way.

Fig. 11. An interval (a,b) in the geometrical representation of real numbers.

Let W be a set, and F(W) the set of its subsets. Let the operation of
closure, C, be defined as a function from F(W) to F(W) obeying the
following rules:

1° $C(C(A)) = C(A)$ for any set $A \subset W$,
2° $C(A_{1} \cup A_{2}) = C(A_{1}) \cup C(A_{2})$ for any two sets $A_{1}, A_{2} \subset W$, and
3° $C(\{a\}) = \{a\}$ for any subset of W containing only one element $a \in W$.

A set W where an operation of closure is in this way defined is called
a topological space.

In a topological space one uses the following terms: Every element
$a \in W$ is called a point of the space W. The set $C(A) \subset W$ is called the
closure of the set $A \subset W$. A set $A \subset W$ which is identical with its closure,
C(A) = A, is called a closed set. A set A whose complement is a closed
set, C(W-A) = W-A (the complement W-A of A is the set of all
the points of W which do not belong to A), is called an open set. Any
open set A to which a point a belongs, is called a neighbourhood of the
point a.

Of course, an example of topological space is the set R of real numbers.
Any subset A of R is composed of the intervals of the type (a,b) we
studied above. We can define the closure C(A) as the set obtained
from A when both the end points a and b of all the intervals (a,b), $a \neq b$,
belonging to A, and all the single points (a,b), a = b, belonging to A
are included. Any single point and any set $A \subset R$ composed of
mere single points and closed intervals is a closed set. The open sets
of R are those composed of open intervals. We call this natural topology
of real numbers the Euclidean topology.

The Euclidean topology can be obviously defined in every cartesian
product space $R \times R \times \dots \times R = R^{n}$ composed of the space R of the
real numbers. This makes every space $R^{n}$ a topological space having
the Euclidean topology. In the cartesian product space $R^{n}$ there is a
point for every sequence $(x_{1}, x_{2}, \dots, x_{n})$ of n real numbers. The correspondence
between the points of $R^{n}$ and such sequences is one-to-one.
We can define an addition of any two points

$$ x = (x_{1}, x_{2}, \dots, x_{n}) \in R^{n} \quad \text{and} \quad y = (y_{1}, y_{2}, \dots, y_{n}) \in R^{n} $$

by the vector addition rule:

$$ x + y = (x_{1}+y_{1}, x_{2}+y_{2}, \dots, x_{n}+y_{n}) \in R^{n}. $$

We can also define the multiplication of a point $x = (x_{1}, x_{2}, \dots, x_{n}) \in R^{n}$
by a real number $k \in R$ by the scalar multiplication rule:

$$ kx = (kx_{1}, kx_{2}, \dots, kx_{n}) \in R^{n}. $$

In this way $R^{n}$ becomes an n-dimensional vector space.

We can introduce to the space $R^{n}$ a metric, whether positive-definite,
or negative-definite, or indefinite, in an infinite number of different
ways. We always can, if we like, introduce to $R^{n}$ even the Euclidean
metric. This is done by defining an inner product between $x \in R^{n}$ and
$y \in R^{n}$ by

$$ \langle x \mid y \rangle = x_{1}y_{1} + x_{2}y_{2} + \dots + x_{n}y_{n}. $$

Then the norm and the distance become Euclidean:

$$ |x-y| = \sqrt{(x_{1}-y_{1})^{2} + (x_{2}-y_{2})^{2} + \dots + (x_{n}-y_{n})^{2}}. $$

Therefore, every space $R^{n}$ can be called Euclidezable: a Euclidean
metric can be always defined in it, if we like, but sometimes we may
need a non-Euclidean, or even non-positive-definite metric in a Euclidezable
space. What is essentially expressed by saying that the space
$R^{n}$ is Euclidezable is 1° that it has the Euclidean topology, and 2°
that, if we like, a Euclidean metric can be defined in it. To distinguish
$R^{n}$ from the n-dimensional Euclidean space $E_{n}$, where a Euclidean
metric is actually defined, we thus call $R^{n}$ an n-dimensional Euclidezable
space. Obviously, every vector space $V_{n}$ over the field R of real
numbers is an n-dimensional Euclidezable space.
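The Euclidean inner product and the distance derived from it are directly computable for points given as sequences of n real numbers. The following sketch is an illustration only; the function names `inner` and `distance` and the use of Python tuples for points of R^n are assumptions.

```python
import math

def inner(x, y):
    """Euclidean inner product x_1 y_1 + x_2 y_2 + ... + x_n y_n."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))

def distance(x, y):
    """Euclidean distance |x - y|, the norm of the difference vector."""
    diff = [x_i - y_i for x_i, y_i in zip(x, y)]
    return math.sqrt(inner(diff, diff))

print(inner((1, 2, 3), (4, 5, 6)))          # 1*4 + 2*5 + 3*6
print(distance((0.0, 0.0), (3.0, 4.0)))     # the 3-4-5 right triangle
```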


2 / Topological Mappings 


Let $W_{1}$ and $W_{2}$ be two topological spaces. Let h be a function from
$W_{1}$ to $W_{2}$, and H the corresponding function from $F(W_{1})$ to $F(W_{2})$
induced by h. Then h is called a homeomorphism from $W_{1}$ to $W_{2}$, if

1° h is one-to-one, and

2° the operation of closure is invariant in H: $H \circ C = C \circ H$ or
$(H \circ C)(A) = H(C(A)) = C(H(A)) = (C \circ H)(A)$ for any $A \subset W_{1}$.

The homeomorphisms thus are the one-to-one functions that transform
open sets to open sets, and closed sets to closed sets.

A continuous mapping h is a function from $W_{1}$ to $W_{2}$ obeying the
following less stringent condition:

$$ H(C(A)) \subset C(H(A)) \quad \text{for any } A \subset W_{1}. $$


In the case of a continuous mapping we can say that the inverse image
$h^{-1}(B)$ of every closed set $B \subset W_{2}$ is a closed set of $W_{1}$, and that the
inverse image of every open set is open.

A homeomorphism could be defined also as a one-to-one function h
for which both h and $h^{-1}$ define a continuous mapping. The homeomorphisms
and the continuous mappings can both be called topological
mappings.


3 / Topological Manifolds 


If a neighbourhood A of a point a of a topological space W can be
homeomorphically mapped to an open set X(A) of a Euclidezable
space $R^{n}$, the topological space W is locally Euclidezable at the point a.
Let x be the homeomorphism from A to X(A). The function x associates
then to each point b of the neighbourhood A n real numbers

$$ x_{1}(b), x_{2}(b), \dots, x_{n}(b), $$

viz. the coordinates of the point $x(b) \in R^{n}$:

$$ b \mapsto (x_{1}(b), x_{2}(b), \dots, x_{n}(b)) = x(b) \in R^{n} \quad \text{for every } b \in A. $$

These real numbers are called the local coordinates of the point b
introduced by the homeomorphism x.

Fig. 12. The local coordinates




If a topological space W is locally Euclidezable everywhere, with
respect to a given n-dimensional space $R^{n}$, and if W can be covered
by an enumerable union $A_{1} \cup A_{2} \cup A_{3} \cup \dots$ of open sets of W, we call W
an n-dimensional topological manifold, and denote it by $M_{n}$. In a topological
manifold we can introduce everywhere local coordinates
$(x_{1}(b), x_{2}(b), \dots, x_{n}(b)) \in R^{n}$ which are valid for the points b in some
neighbourhood A of any fixed point $a \in W$, but we cannot introduce the full
system of Euclidean coordinates: there need not be a one-to-one mapping
of the whole space W to the whole Euclidezable space $R^{n}$. An example
of such a 2-dimensional topological manifold is a sphere.

Let $A_{1} \cup A_{2} \cup A_{3} \cup \dots$ be a covering of a topological manifold $M_{n}$
by open sets. Let A be one of the sets of the covering, and x be a homeomorphism
from A to $X(A) \subset R^{n}$. If f is a one-to-one function from
X(A) to $Y \subset R^{n}$, then the composite function $f \circ x = y$ defines a homeomorphism
from A to $Y \subset R^{n}$. The two local coordinate systems defined
in A by x and by y are related to one another by the one-to-one transformation f:

$$ y(b) = f(x(b)) \quad \text{for every } b \in A \subset M_{n}. $$

Such a function f is called a transformation of coordinates. Let the function
f be m-times differentiable (with continuous derivatives, with respect
to each of its arguments). If we allow the local coordinates of A to be defined
either by the original coordinates x or by any new coordinates obtained
from the original ones by an m-times differentiable function f, we have
in the set A a differentiable coordinate system of the class m. Let us
assume that we have in each set belonging to the covering $A_{1} \cup A_{2} \cup \dots$
of $M_{n}$ a differentiable coordinate system of the same class m.

Let A and B be two arbitrary sets belonging to such a covering of $M_{n}$.
If then all the allowed coordinate systems in $A \cap B$ are connected with
one another by m-times differentiable functions, the manifold $M_{n}$ is
called a differentiable manifold of the class m. A differentiable manifold
whose class of differentiability is that of $m = \infty$ is called an analytic
manifold.

Recently, a new branch of algebraic topology was created by A. 
Grothendieck. The spaces of Grothendieck are expected to have great 
significance in future topological cybernetics (cf. p. 8), but we shall 
not discuss them here. 




6 § Complex Numbers 


1 / What are the Complex Numbers for a Science? 


It is easy to understand why the rational and even the real numbers
in general are helpful in every science in which measurement is performed.
To speak first of the rational numbers, every result of measurement
is a rational number indicating the proportion of the measured
magnitude to a unit of measurement. As to the necessity of the real
numbers in general, we know proportions which can be observed in
reality but which cannot be expressed in terms of rational numbers.
Such a proportion is, for instance, the proportion of the diagonal of
a square to the length of a side of the same square. In addition to the
rational numbers also the irrational numbers, and thus all the real
numbers, are needed in science based on observation and measurement.

In order to see the necessity of the complex numbers in such a
science we must consider an equation of the form

(22)  $$ a_{n}x^{n} + a_{n-1}x^{n-1} + \dots + a_{1}x + a_{0} = 0, \quad a_{n} \neq 0. $$

Equations of this kind, where the coefficients $a_{0}, \dots, a_{n}$ are all real
numbers, often appear in the analysis of the results obtained by observation
and measurement. We may need to solve the equation for the
unknown x in this analysis. Then we meet a restriction of the real numbers:
we may have an equation of the above type which does not have
any solution x given by a real number.

An example of the equation of the kind mentioned is

$$ x^{2} + 1 = 0. $$

Since this equation means that $x^{2} = -1$ and since there is no real
number x such that its square would be equal to -1, the equation
has no solution by real numbers.

Here the complex numbers come to help the analyst. One can show
that every equation of the above type, where the highest power of
x is $x^{n}$, has exactly n solutions given by complex numbers
$x = c_{1}, \dots, x = c_{n}$. Accordingly, we can rewrite the equation by means of these
solutions in the form

(23)  $$ a_{n}(x-c_{1}) \cdots (x-c_{n}) = 0. $$

Some of the solutions $c_{i}$ may be multiple solutions which means that
we must take a power $(x-c_{i})^{k_{i}}$ of the binomial $(x-c_{i})$ in the left side
of this equation, $k_{i}$ being a positive integer which shows the multiplicity
of the solution $x = c_{i}$. We call $c_{1}, \dots, c_{n}$ the ’roots’ of the equation
in question.

Of course the complex numbers must again disappear from the
analysis before we can compare the results with reality. This may
happen, for instance, in such a way that we have to deal in our analysis only
with such sums $c_{i}+c_{j}$ or such products $c_{i}c_{j}$ of the roots which happen
to be real numbers. However, we need knowledge of the roots of the
real-coefficient algebraic equations in order to be able to operate with
them in our analysis.

Here, as always, the usefulness of a mathematical tool in science is
not restricted to the case in which it was first needed but is soon
extended to new problems. Once we have introduced the complex
numbers in a science in connection with the real-coefficient equations
we observe that they are also useful in many other connections. This
is actually the case. However, we shall restrict the discussion here to
the role of complex numbers as roots of real-coefficient algebraic
equations.


2 / The Algebraic Operations on Complex Numbers 


We know from school that the complex numbers are introduced by
starting with the equation $x^{2} + 1 = 0$. We enlarge the field R of real
numbers to a set C which contains as its elements all the combinations
of the form

$$ c = a + ib. $$

Here a and b are real numbers while i is an element of C defined by

$$ i = \sqrt{-1}. $$

So we have $i^{2} = -1$, and likewise $(-i)^{2} = -1$, so that i and -i are
the roots of the equation $x^{2} + 1 = 0$.
Let us now consider the set

$$ C = \{\, c = a + ib;\ a \in R,\ b \in R,\ i = \sqrt{-1} \,\}. $$


We can represent the elements of this set by the points in a coordinate
plane (xy) where a is the x-coordinate and b the y-coordinate of the
point c. We observe at once that when a and b are allowed to go through
all real values the point c goes through all the points of the coordinate
plane. Obviously, the correspondence between the complex numbers c
and the points of the plane $R^{2}$ is one-to-one. Accordingly, the operations
on complex numbers can be studied on this coordinate plane, remembering
that with the y-coordinates we must associate the imaginary unit
i as factor in order to get complex numbers.

First we observe that the plane (xy) is a two-dimensional vector
space. Thus the addition of vectors as well as the scalar multiplication
of vectors can be performed on the points c. We get simply

$$ c_{1} + c_{2} = (a_{1} + a_{2}) + i(b_{1} + b_{2}), \quad \text{and} $$

$$ kc = ka + ikb, $$

respectively. The former operation defines an addition for the complex
numbers, and the latter defines a multiplication of the complex numbers
by a real number k.


To define a multiplication for complex numbers we can introduce
the coordinates $(r,\varphi)$ instead of the original (x,y) by writing

$$ a = r\cos\varphi, \quad b = r\sin\varphi $$

for each point c (see Fig. 13).

Fig. 13. The complex plane

Then we can write

(24)  $$ c = a + ib = r(\cos\varphi + i\sin\varphi) = re^{i\varphi}, $$

where we have applied the Euler formula (16). Now the multiplication
of complex numbers can be defined simply by

$$ c_{1}c_{2} = r_{1}e^{i\varphi_{1}}\, r_{2}e^{i\varphi_{2}} = r_{1}r_{2}\,e^{i(\varphi_{1}+\varphi_{2})}. $$

Thus two complex numbers are multiplied with one another by multiplying
their ’absolute values’ $r_{1}$ and $r_{2}$, and adding up simultaneously their
’arguments’ $\varphi_{1}$ and $\varphi_{2}$.
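The rule of multiplying the absolute values and adding the arguments can be compared with the multiplication obtained by expanding (a_1 + ib_1)(a_2 + ib_2) with i^2 = -1. The sketch below is an illustration; the function name `multiply_polar` and the sample numbers are assumptions, while `cmath.polar` and `cmath.rect` are the standard-library conversions between c and the pair (r, φ).

```python
import cmath

def multiply_polar(c1, c2):
    """Multiply two complex numbers via absolute values and arguments."""
    r1, phi1 = cmath.polar(c1)       # c1 = r1 * exp(i * phi1)
    r2, phi2 = cmath.polar(c2)
    return cmath.rect(r1 * r2, phi1 + phi2)

c1, c2 = 1 + 2j, 3 - 1j
print(multiply_polar(c1, c2), c1 * c2)   # both should be close to 5 + 5i
```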




It is easy to verify also that the multiplication just defined obeys
associativity, commutativity, and distributivity with respect to addition.
We also observe at once that each complex number c other than zero has
the inverse

$c^{-1} = r^{-1} e^{-i\varphi}.$

A further operation is defined for complex numbers, viz. the complex
conjugation. This is a function from C to C itself, which associates with
each complex number c its 'complex conjugate'

$c^* = a - ib = re^{-i\varphi}.$


This operation is important because for each c the sum $c + c^*$ and the
product $cc^*$ are real numbers:

(25) $c + c^* = 2a, \quad cc^* = r^2 \geq 0.$

Complex conjugation means geometrically a reflection of the points
c with respect to the x-axis so that c and c* are located symmetrically
with respect to this axis.
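The inverse and the conjugate in polar form, together with the two identities (25), can be illustrated as follows. This is a sketch under the assumption that Python's standard cmath module is used; the variable names and the sample number 3 - 4i are chosen here for illustration only.

```python
import cmath

# Illustrative number: a = 3, b = -4, hence r = 5.
c = complex(3.0, -4.0)
r, phi = cmath.polar(c)

# Inverse: reciprocal absolute value, argument with opposite sign.
inverse_polar = cmath.rect(1.0 / r, -phi)

# Conjugate: same absolute value, argument with opposite sign.
conjugate_polar = cmath.rect(r, -phi)

# The two quantities of equation (25).
sum_with_conjugate = c + c.conjugate()      # equals 2a
product_with_conjugate = c * c.conjugate()  # equals r^2

print(inverse_polar * c, conjugate_polar)
print(sum_with_conjugate, product_with_conjugate)
```

The imaginary parts of the sum and of the product vanish, in agreement with (25).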


3 / The Roots of Algebraic Equations 


When we know that every algebraic equation

$a_n x^n + \dots + a_1 x + a_0 = 0, \quad a_n \neq 0,$

has exactly n solutions, or roots (counted with their multiplicities),
given by some complex numbers $x = c_1, \dots, x = c_n$, we can easily
derive either these roots themselves or at least some of the properties
they have. We restrict ourselves in the following only to the case of
real coefficients $a_0, \dots, a_n$.

The above equation expresses the equality of two (in general) complex
numbers, viz. the number $a_n x^n + \dots + a_1 x + a_0$ and the number 0.
Taking the complex conjugate of both of these numbers we have another
equality, since the complex conjugates of two identical complex numbers
are the same:

$a_n (x^*)^n + \dots + a_1 x^* + a_0 = 0.$

In this equation the real coefficients are the same as before, and only
x is replaced by x*. Accordingly, this equation has exactly the same
solutions as before:

$x^* = c_1, \dots, x^* = c_n.$

We conclude that our original equation has for each root $x = c_i$ also
the root $x = c_i^*$. Thus the roots $c_1, \dots, c_n$ may contain single real
numbers or pairs (c, c*) of mutually conjugate complex numbers.
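The argument rests on the fact that a polynomial with real coefficients commutes with complex conjugation. A small numerical sketch of this fact is given below in Python; the function name poly, the coefficient list and the test points are assumptions made for the sake of illustration.

```python
def poly(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 by Horner's rule.

    coeffs lists the real coefficients from a_n down to a_0.
    """
    result = 0
    for a in coeffs:
        result = result * x + a
    return result


# Illustrative polynomial x^2 - 2x + 5 with the roots 1 + 2i and 1 - 2i.
coeffs = [1.0, -2.0, 5.0]

c = complex(1.0, 2.0)
value_at_c = poly(coeffs, c)
value_at_conj = poly(coeffs, c.conjugate())

# For an arbitrary point z the value at z* is the conjugate of the value at z.
z = complex(0.3, -1.7)
difference = poly(coeffs, z.conjugate()) - poly(coeffs, z).conjugate()

print(value_at_c, value_at_conj, difference)
```

Since the polynomial vanishes at c, it vanishes at c* too, which is the pairing of roots stated above.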




Let us now consider our equation in some special cases. For n = 1
we have the equation

$a_1 x + a_0 = 0, \quad a_1 \neq 0.$

This equation has one root and it is real:

$x = -a_0/a_1.$

For n = 2 we have the equation

$a_2 x^2 + a_1 x + a_0 = 0, \quad a_2 \neq 0.$

We know that it has exactly two roots, viz. a number c and its conjugate
c* (two different real roots are not conjugates of each other, but the
formula obtained at the end of this derivation holds for them too).
Writing the equation by means of these roots it reads

$a_2 (x - c)(x - c^*) = 0.$

Expanding this we get

$a_2 x^2 - a_2 (c + c^*) x + a_2 cc^* = 0.$

By comparing the coefficients with the original form of the equation
we find that

$a_1 = -a_2 (c + c^*)$ and $a_0 = a_2 cc^*.$

Here are two equations for the determination of the two roots c and c*.
We get

$c + c^* = 2a = -a_1/a_2$ so that $a = -\frac{a_1}{2a_2}$, and

$cc^* = r^2 = a_0/a_2.$

Since $r^2 = a^2 + b^2$ we have the further result

$b^2 = \frac{a_0}{a_2} - \frac{a_1^2}{4a_2^2} = \frac{4a_2 a_0 - a_1^2}{4a_2^2}.$

The two roots are thus given by

$x = -\frac{a_1}{2a_2} \pm \frac{\sqrt{a_1^2 - 4a_2 a_0}}{2a_2}.$
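The root formula for n = 2 can be tried out on a pair of conjugate roots and on a pair of real roots. The following Python sketch assumes the standard cmath module, whose square root also accepts negative arguments; the function name quadratic_roots and the two example equations are illustrative assumptions.

```python
import cmath


def quadratic_roots(a2: float, a1: float, a0: float):
    """Return both roots of a2*x^2 + a1*x + a0 = 0 for real coefficients."""
    if a2 == 0:
        raise ValueError("a2 must differ from zero")
    # cmath.sqrt gives an imaginary result when its argument is negative.
    root_term = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return (-a1 + root_term) / (2 * a2), (-a1 - root_term) / (2 * a2)


# x^2 - 2x + 5 = 0: the roots form a conjugate pair 1 + 2i, 1 - 2i.
complex_pair = quadratic_roots(1.0, -2.0, 5.0)

# x^2 - 3x + 2 = 0: both roots are real, 2 and 1.
real_pair = quadratic_roots(1.0, -3.0, 2.0)

print(complex_pair, real_pair)
```

In the first case the two roots are conjugates of one another; in the second case each root is real and thus its own conjugate.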

For the equation beginning with the power $x^3$ we at once obtain
the result that it has either one real root and two mutually conjugate
complex roots, or else all its roots are real. These are obviously the
only possibilities to accord with the general rule that for each root c
the complex conjugate c* is also a root. This fact can be used in the
solution of such an equation. In a similar way we can use our general
knowledge on the roots of algebraic equations in the solution of
equations containing higher powers of x.




We may also use the possibility to represent the complex number
c in the form $re^{i\varphi}$ when solving algebraic equations. As an example
let us consider the equation

$x^n = a_0,$

where $a_0$ is a real number, either positive or negative. If $c = re^{i\varphi}$ is
to be a root we must have

$r^n e^{in\varphi} = r^n(\cos n\varphi + i\sin n\varphi) = a_0.$

Since $a_0$ is real this gives two equations, viz.

$r^n \cos n\varphi = a_0$ and $\sin n\varphi = 0.$


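The latter gives

$n\varphi = 0, \pi, 2\pi, \dots, (2n-1)\pi,$

or

$\varphi = 0, \frac{\pi}{n}, \frac{2\pi}{n}, \dots, \frac{(2n-1)\pi}{n}$

as possible different solutions for $\varphi$. For these values of $\varphi$ we have
only two possible values of $\cos n\varphi$, viz.

$\cos n\varphi = +1$ if $\varphi = 0, \frac{2\pi}{n}, \frac{4\pi}{n}, \dots, \frac{(2n-2)\pi}{n}$,

$\cos n\varphi = -1$ if $\varphi = \frac{\pi}{n}, \frac{3\pi}{n}, \dots, \frac{(2n-1)\pi}{n}$.

Fig. 14. The roots of $x^n = a_0$ in the complex plane.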


If $a_0$ is positive we obviously have to choose the upper sequence of
n values for $\varphi$. If $a_0$ is negative we have to choose the lower n values
for $\varphi$. These values together with the value

$r = \sqrt[n]{|a_0|}$

complete the solution of the equation $x^n = a_0$. The solutions can be
represented graphically in the complex plane: they are points located
symmetrically along the circle whose center is the origin and whose
radius is $\sqrt[n]{|a_0|}$.
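The recipe for the roots of $x^n = a_0$ may be sketched in Python as follows. The sketch assumes the standard modules cmath and math; the function name roots_of_power_equation and the two sample equations are hypothetical and serve only as an illustration.

```python
import cmath
import math


def roots_of_power_equation(n: int, a0: float):
    """Return the n roots of x^n = a0 for a real a0 different from zero."""
    if n < 1 or a0 == 0:
        raise ValueError("n must be at least 1 and a0 must differ from zero")
    r = abs(a0) ** (1.0 / n)  # common absolute value of all the roots
    # Positive a0: arguments 0, 2*pi/n, 4*pi/n, ... (cos n*phi = +1).
    # Negative a0: arguments pi/n, 3*pi/n, ...      (cos n*phi = -1).
    offset = 0 if a0 > 0 else 1
    return [cmath.rect(r, (2 * k + offset) * math.pi / n) for k in range(n)]


roots_pos = roots_of_power_equation(4, 16.0)  # x^4 = 16
roots_neg = roots_of_power_equation(3, -8.0)  # x^3 = -8

for x in roots_pos + roots_neg:
    print(x, abs(x))
```

All the roots printed have the same absolute value, so they lie on one circle around the origin, spaced at equal angles.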




CHAPTER II


Fundamental Cybernetic Notions 


1 § A Glance at the
History of Cybernetics


1 / Descartes 


The analogy between animal and machine was first pointed out by 
René Descartes, the well-known 17th century French philosopher. 

Descartes saw the animal as a machine which is in interaction with 
its environment. The environment acts on the animal through the 
stimuli S which are received by the animal. The animal acts on the 
environment through its reactions R by which the animal responds 
to the stimuli. Thus a notion which is much employed in behavioral 
science, viz. the stimulus-response connection S—R, comes from Des- 
cartes. So also does the term reflex. According to him a reaction R to 
a stimulus S comes from the animal like a reflexion from a mirror. 
He had thus a fully deterministic conception of the behaviour of the 
animal. 

As is well known Descartes did not apply the analogy between animal
and machine to man. At that time, when religion still controlled
intellectual life, it would indeed have been too daring an enterprise to
relate the entities of human consciousness with material things. Descartes
gave to the thoughts, the emotions and the will of the human being
an independent existence, and so landed in a dualistic conception of
reality.

It is paradoxical that a philosopher who introduced the first cybernetic
conceptions became generally known as a philosopher of dualism.
In fact this is really unjust: Descartes did not 'invent' dualism; he
just gave a formulation to the philosophical ideas which were typical
and common in the European thinking of that century. In an atmosphere
saturated by religion it would hardly have been possible to extend the
materialistic conceptions of cybernetics to the entities of human
consciousness.


2 / Pavlov 


The posthumous reputation of many a great innovator is narrow
in an unjust way. Ivan Pavlov, for instance, is generally known for his
experimental work with reflexes, while the great theoretical framework
created by Pavlov has been less known outside the Soviet Union.
However, there are good reasons to see in Pavlov the founder of cybernetic
thinking.

The main part of Pavlov's scientific work concerned the analogy
between animal and machine in a way which now would be called
cybernetics. One part of this work was, of course, the study of reflexes.
One can easily understand the interest in reflexes in the works of the
early pioneers of cybernetics: these are the simplest and most obvious
mechanisms in the behavior of animals which act like machines. We
know that Pavlov developed the theory of reflexes. He showed that
besides the unconditioned reflexes, which need not and cannot be
learned, there are conditioned reflexes which are formed by learning.
In addition to the 'first signal system' associated with elementary
conditioned reflexes there is the 'second signal system' associated,
for instance, with the learning of language.

However, Pavlov also developed fundamental notions related to the 
cybernetic analogy between animals and machines. He was the first 
to approach the study of living organisms from a system theoretical 
point of view. The notion of 'system' is borrowed from theoretical physics.
There we have both ’closed systems’ and ’open systems’. A physical 
system composed of material things is called closed if it is not in inter- 
action (in exchange of energy) with the environment. The system is 
open, if there is interaction between the system and the environment. 

In the theoretical framework of Pavlov the reflexes carry out the
interaction between a living organism and the environment. Accordingly,
an organism must be characterized as an open system. Now comes
the decisive question: what is then the fundamental behavioral difference
between a living organism and a lifeless open physical system? Pavlov was
the first who answered the question by introducing the central notion of
cybernetics: the notion to which one has referred by the terms
self-regulation (this was the term used by Pavlov), or self-control, or
self-steering. Pavlov² stated as early as 1917 that the behaviour of a living
organism is distinguished from all the other open material systems by
its highly developed ability of self-regulation, or self-steering.

Thus one can take the notion of self-steering as a Pavlovian notion. 
The introduction of the notion of self-steering to the analysis of living 
organisms was the beginning of cybernetic thinking in the modern 
sense of the word. 


3 / Wiener 


When the analogy between animal and machine was further developed 
it was ever more obvious that the notion of self-steering, introduced 
by Pavlov, was to be essential in the analogy. The study of self-steering 
material systems is the foundation of the science which is in our time 
called cybernetics, according to the well known book of Norbert
Wiener³.

The scientific work of Wiener has been helpful in co-ordinating a 
great variety of different approaches under a common title and in an 
integrated science. When co-ordinating the various approaches the 
significance of feedback as an essential condition of self-steering was 
revealed better than ever before. So one can say that the indication 
of the general role of feedback couplings in all the self-steering systems 
was one of the most important contributions of Wiener to cybernetic 
theory. 


4 / McCulloch and von Neumann 


The early developments in the theory of automata were one of the
approaches from which cybernetics emerged. A. M. Turing had as
early as 1936 formulated an interesting thesis associated with automata.
The most important contributions to cybernetics from the theory of
automata perhaps came from two scientists, both of whom are
distinguished by the wide range of their scientific interests and abilities.
I am referring here to Warren McCulloch, a mathematician who became
a professor of psychiatry, and Johann von Neumann, one of the great
mathematicians of our century.


2. Ivan Pavlov, Selected Works. Moscow (no year of printing). 
3. N. Wiener, Cybernetics, 1. ed. 1948, 2. ed. New York 1961. 




In particular the materialistic investigation of living beings in terms
of cybernetic notions was developed by McCulloch and von Neumann.
McCulloch, together with the mathematician Pitts, proved in 1943
the equivalence of neural nets and finite automata⁴. The significance
of this result is very great in the understanding of the nervous system.
McCulloch and Pitts also developed interesting ideas on the materialistic
interpretation of universals in cybernetic terms⁵. One can also mention
an interpretation of the will given by McCulloch⁶.

As an outstanding mathematician Johann von Neumann made a
valuable contribution to the discussion of the possibilities of cybernetic
theory in the explanation of creative intellectual performance⁷. Like
another famous mathematician, A. N. Kolmogorov, he recommended
an open point of view according to which it would not be wise to see
any restrictions of principle in these possibilities. Von Neumann also
suggested a cybernetic interpretation of self-reproduction⁸.

The many technological contributions of von Neumann may be
passed by in this short review of the development of the main ideas
of cybernetics.


5 / Oskar Lange 


If Pavlov was the first to formulate the idea of purposive self-steering, 
Wiener the one who directed general attention to the significance of 
feedback as a causal basis of self-steering, and if McCulloch and von 
Neumann have contributed most to the cybernetic analysis of the 
phenomena of life, then Oskar Lange must be mentioned as the social 
scientist who has contributed very much to the introduction of cyber- 
netic method in social science. 

Lange emphasized the significance of cybernetic method in connection 
with two general problems, viz. that of ’wholes’ and that of dialectical 
development⁹. Lange also gave an elegant mathematical formulation
for the connection between feedback and self-steering. 


4. W. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous
activity. Bull. Math. Biophysics 5, pp. 115–133, 1943.

5. W. Pitts and W. McCulloch, How we know universals: the perception of
auditory and visual forms. Bull. Math. Biophysics 9, pp. 124–147, 1947.

6. W. McCulloch, Why the mind is in the head? In Jeffress (Ed.), Cerebral
Mechanisms in Behavior, Hafner Co. 1951, 2. ed. 1967.




To conclude this short review of the development of cybernetic 
ideas it should be mentioned that there are of course a lot of important 
works of a technological or mathematical nature which we have passed 
by above. The technical construction of many an ingenious apparatus 
has contributed to the understanding of cybernetics. The mathematical 
work done in the theory of differential equations and in the study of 
ergodic processes in operational analysis and elsewhere has also contri- 
buted to cybernetic theory. 


2 § System Notions


1 / General Systems 


Let $V_1, \dots, V_n$ be any n sets of objects of reality. Any subset S of the
cartesian product $V_1 \times \dots \times V_n$,

(1) $S \subset V_1 \times \dots \times V_n,$

defines a general system. The sets $V_1, \dots, V_n$ are called the system objects,
or the objects of the system S.

The above definition of general system is very general indeed. It
identifies the notion of system and the notion of mathematical relation
between n objects of reality. Indeed, a subset S is composed of some
of the sequences $(x_1, \dots, x_n)$, where $x_1$ is an element of $V_1$, $x_2$ an
element of $V_2$, etc.:

$S \subset \{(x_1, \dots, x_n);\ x_1 \in V_1, \dots, x_n \in V_n\}.$

Such a subset of $V_1 \times \dots \times V_n$ was called, as can be seen on page
34, an n-member mathematical relation.
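How little the definition of a general system demands can be seen from a small sketch in Python. It assumes the standard itertools module; the two system objects V1 and V2, their elements, and the function name is_general_system are hypothetical and chosen only for illustration.

```python
from itertools import product

# Two hypothetical system objects (sets of 'objects of reality').
V1 = {"low", "high"}
V2 = {0, 1, 2}

# The cartesian product V1 x V2: all sequences (x1, x2).
cartesian = set(product(V1, V2))

# Any subset of the product is a general system, i.e. a relation.
S = {("low", 0), ("high", 2)}


def is_general_system(candidate, *object_sets):
    """Check that every sequence of the candidate lies in the product."""
    return set(candidate) <= set(product(*object_sets))


# Every subset qualifies, so even this tiny product carries 2^6 systems.
number_of_systems = 2 ** len(cartesian)

print(is_general_system(S, V1, V2), number_of_systems)
```

The count of possible subsets grows exponentially with the size of the product, which already hints at why the bare definition says so little about any particular object of study.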

The definition of general system does not contain any specification
of the nature of the objects of reality concerned. Neither does it contain
any guarantee of the significance of the relation S: whether it expresses
a complete, sufficient or insufficient characterization of the mutual
relations of the objects in question for some particular purpose or
not, whether the relation is invariant over some interval of time or not,
etc.

7. See, for instance, J. von Neumann, The general and logical theory of automata.
In L. A. Jeffress (Ed.), Cerebral Mechanisms in Behavior, 1. ed. 1950, 2. ed. Hafner
Co. 1967.

8. J. von Neumann, Theory of Self-Reproducing Automata. London 1966.

9. Oskar Lange, Wholes and Parts, Pergamon Press 1965 (the Polish original in
Warsaw 1962), and Oskar Lange, Theory of Reproduction and Accumulation,
Pergamon Press 1969 (the Polish original in 1965).

For instance, the set $V_1$ could be composed of the national account
statistics of Austria in 1954, containing as elements the national income
of Austria in that year, the number of people between 15 and 64 in
Austria in that year, the number of tons of wheat produced in Austria
in 1954, etc. The set $V_2$ could be composed of the properties of a 1963
Citroën model, containing as elements the number of persons for which
the model is designed, the number of litres of petrol the car uses per
one hundred kilometres, etc. The set $V_3$ could contain as elements
the distance of the moon from the earth on September 21, 1947, the
number of sunspots observed on March 3, 1921, and other astronomical
observations referring to various days and years. Still any subset S
composed of the sequences $(x_1, \dots, x_n)$ is a general system.

We can see that the definition of general system has very little 
to give to the analysis of reality. Just a mathematical relation between 
some objects of reality has no significance whatsoever, unless the objects 
to which it refers form a kind of reasonable whole localized in space 
and time, and unless the relation itself has some properties of invariance 
and significance. We meet here the fact which we have met earlier 
already: generalization for the sake of generalization itself is not meaning- 
ful in mathematical analysis. It is the specification which gives sense 
to mathematical relations. 

In other words: if we defined our object of study as a general
system, and called it 'system', then one and the same system would contain
elements from very different kinds of material objects, and the same
material object would involve an infinite number of different 'systems'
(cf. W. R. Ashby, An Introduction to Cybernetics, London 1956, p. 39).
Here we want to give a more definite content to the notion of system.


2 / Material Systems 


We specify general systems first by requiring that the system objects
$V_1, V_2, \dots, V_n$ belong to a material object M localized in space and time.
The volume of space where the object M is localized may be a connected
piece of space, as it is if M is a certain chair in a certain room, or a
certain car driving in a certain street, or a certain man having a certain
name, or a certain country, etc. Or the volume where M is localized
may be a disconnected one, i.e. composed of several connected pieces.
In this case we say that M is a collection of material objects
$M_1, M_2, \dots, M_k$, each of which is localized in a connected volume
of space. Examples of a material collection M are the four chairs located
in a certain room, or a certain ten cars driving in a certain street, or
a group of people sitting around a certain table, or a population of
all the citizens in a given country in a given year, etc.

The localization of a material object, or of a material collection, 
is not rigid. A chair can be moved to another room and it is still the 
same chair. A car driving in the street may drive to another city, or to 
another country, and it is still the same car. The citizens of a given 
country may travel through the world without losing their citizenship. 
Thus we cannot require any fixed location: the only thing we can do
is to require that the object M to which all the system objects $V_1, \dots, V_n$
refer is a real material object located in some connected or disconnected
volume of space at every moment of some interval of time.

At every moment of some interval of time? Indeed it is meaningless
to extend the consideration to any point of time whatsoever. A material
object may be transformed to something which does not interest us
anymore in our study. For instance, a material thing used for some
particular purpose, like a chair, may be broken and cast away. If a
material object is a living being it dies some day, and is transformed
to a collection of non-living material objects which has other properties
than the living being had, and thus is characterized by other kinds of
relations than the living being was. If the relation R of our system
is to express something characteristic of the material object M, then
we must obviously restrict somehow the interval of time to which
our system refers. In other words, we must require that M is localized
not only in space but also in time.

We already referred to a requirement being imposed on the relation R
defining the general system (formerly denoted by S) which we are studying.
In order that our notion of system be helpful in the analysis of reality
the relation R must obviously give a sufficient characterization of the
properties of the material object M. This condition is important even
though it is difficult to say in any general terms which characterization
is sufficient in each case. But it is obvious that we must aim at a
sufficient characterization of all the relevant properties of M when
we are constructing the relation R, if the notion of system is to have
any significance at all in the analysis of reality. Besides giving
a sufficient characterization, relevant to the purpose of the study,
this relation must also be invariant over some interval of time in order
to have significance in the analysis of reality. Thus we have two
requirements to be imposed on the basic relation of the system: sufficient
characterization of M and invariance over some interval of time. We
call such a relation R a characteristic relation of the material object M,
and indicate this by writing

$R = \mathrm{Ch}\, M.$


We may sometimes consider systems whose material object M or
some of its elements are capable of reproducing themselves. Such a
material object is, for instance, a living organism, or the human
population of a given country. We must then specify in the definition of our
system whether the descendants of M are to be included in M or not.
The solution depends on whether the system relation R is a characteristic
relation of the descendants too or not. For instance, a human being
and his descendants are individually different in many essential aspects.
It is hardly meaningful to treat them otherwise than as different
material objects. On the other hand, the people of which the population
of a country is composed are coupled with one another by social
institutions which in the main may remain the same from one generation
to another. Even though social revolutions may change the structure of
institutions radically, the social institutions between social revolutions
at least may be invariant enough so that it is meaningful to consider them,
during some time, as a socio-economic system undergoing changes but
referring to the same material object. This object contains the population
of the country in question (and, in addition to this, for instance, the
tools of labour in this country). In this case it is meaningful to include
the descendants of M in the material object M of our system.

We can now define a material system S as a combination

(2) $S = (M, R)$ where $R = \mathrm{Ch}\, M.$

In words: a material system is a combination of a material object M
and a relation R such that R is a characteristic relation of M.

Here R alone suffices to define a general system. R is a subset of
the cartesian product $V_1 \times \dots \times V_n$ in which $V_1, \dots, V_n$ are the
system objects. The system objects all refer here to the same material
object M.

M may be located in space either in a connected or a disconnected
volume, and this localization is not fixed in either case to a given closed
volume of space. If the localization is disconnected we may speak
sometimes of a material collection M instead of a material object M.
M is always localized in a certain interval of time. M may or may not
include its own descendants: this must be specified separately in each
case.

R must give a sufficient characterization of all the relevant properties
of M, and it must also be invariant over some interval of time.


3 / Systems of Definite Topology 


A material object may or may not have a material boundary surface. In- 
deed we can consider as a material object, in the physical sense of the 
word, any piece of matter, i.e. any collection of molecules. This collec- 
tion of molecules may contain molecules of different substances, of 
different gases or liquids or solid substances, etc. We may consider, for 
instance, all the matter within a certain volume of space. The surface of 
this volume may cut a chair, and close within the volume a part of this 
chair together with a part of the air around the chair. All the molecules 
within the part of the chair which is included in the volume, together 
with all the molecules of the air included in the volume form the material 
object we are considering. Such a material object, composed of a part 
of the chair and of a part of the air, has no material boundary 
surface. 

However, there are in reality many material boundary surfaces which 
separate in a natural way material objects from other material objects. 
A chair, for instance, has a material boundary surface, and water in 
a glass has a material boundary surface. The skin forms the material 
boundary surface of man. 

From now on we restrict the consideration to such material systems
whose material object $M_0$ has a material boundary surface. Accordingly,
we are studying a system $S = (M_0, R)$ where $R = \mathrm{Ch}\, M_0$. If $M_0$ fills
a connected volume in space, this boundary surface is connected. If
$M_0$ is a collection of objects, and fills a disconnected volume in
space, its boundary surface is composed of several distinct parts. For
instance, the material boundary surface of the population of a given
country is composed of all the skins (and other material surfaces) of
all the people of that country. Just as the localization of a material
object is not fixed to a given closed volume of space, the boundary
surface of an object is not fixed to any rigid position in space. Of course
the boundary surface of a man moves and is deformed all the time
the man is moving. The boundary surface of the population of a certain
country may even develop new disconnected parts: this happens when
new people are being born, and an infant is separated from the mother's
body.

The existence of boundary surface means that the material system 
in question has a definite topological structure in space-time. We can 
call such material systems systems of definite topology. More exactly, 

we understand by a system of definite topology a material system 
having a material boundary surface, connected or disconnected, 
through which the interaction of the system with the environment, 
as well as the mutual interaction of the distinct parts of the system, 
is localized in space and time. 
The topological structure of systems can be expected to become an
important part of cybernetic theory in the future, as topological
cybernetics (cf. p. 8). Irrespective of this, a definite topological
structure in space-time must be considered as an important defining
attribute of all the systems discussed in cybernetic theory, whether
elementary or not.

Material systems which have a material boundary surface can be
divided into two classes, viz. those systems whose boundary surface
is completely isolating, and those whose boundary surface is only
relatively isolating. The former systems are completely isolated by their
surface from the rest of the material world, there being no interaction
whatsoever between the system and the environment. Such systems are
in physics called closed systems. An example is a collection of gas
molecules closed within a rigid box which cannot be deformed mechanically
or otherwise and whose walls do not let in or out any form of energy
(heat, or electricity, etc.). If $M_0$ is such a material object, the system
objects $V_1, \dots, V_n$ are properties of the collection of molecules within
the box, for instance, temperature, pressure, entropy, specific gravity,
colour etc. of the gas. The system relation R expresses the physical
laws which are valid between temperature, pressure etc. in such a gas.

There is a very important law which is valid for all closed systems, viz.

The Law of Entropy: In a closed system the physical entropy
increases monotonically, i.e. all the structural and functional
organization within the system disappears in the course of time.

We shall later on study the exact definition of physical entropy.
Now a general characterization of what happens when entropy increases
will suffice.

For a gas closed in an isolating box the increase of entropy means
the following. Whatever be the original distribution of molecules over
the volume, and whatever be the original distribution of the velocities
of the molecules, these distributions tend to a homogeneous one when
the system is isolated from the rest of the world. In other words, the
system tends toward an end state where each component of the gas is
homogeneously distributed over the volume of the box, and the molecules
of each component are moving with the same velocity homogeneously
in all directions. For a liquid or a solid state body which forms a closed
system the Law of Entropy means that all the spatial structures formed
by the molecules, for instance, the crystals, disappear and the matter
within the closed space assumes a gas-like state having the properties
of homogeneity mentioned above.
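The tendency toward a homogeneous distribution can be illustrated by a toy model. The sketch below is written in Python and rests on several assumptions that are not taken from the text: the box is divided into ten cells, the gas is described only by the fraction of molecules in each cell, at each step a fixed share alpha of every cell's content passes to each neighbouring cell, and the Shannon entropy of the fractions stands in for the physical entropy whose exact definition is deferred above.

```python
import math


def shannon_entropy(fractions):
    """Entropy -sum p log p of a distribution (a stand-in, see the lead-in)."""
    return -sum(p * math.log(p) for p in fractions if p > 0.0)


def diffuse(fractions, alpha=0.25):
    """One mixing step: each cell passes the share alpha to each neighbour.

    At the walls the share that would leave the box stays in the cell,
    so nothing enters or leaves the box (a closed system).
    """
    n = len(fractions)
    mixed = []
    for i in range(n):
        left = fractions[i - 1] if i > 0 else fractions[i]
        right = fractions[i + 1] if i < n - 1 else fractions[i]
        mixed.append((1 - 2 * alpha) * fractions[i] + alpha * left + alpha * right)
    return mixed


# Initially all the molecules are in the first of ten cells.
state = [1.0] + [0.0] * 9
entropies = [shannon_entropy(state)]
for _ in range(2000):
    state = diffuse(state)
    entropies.append(shannon_entropy(state))

print(entropies[0], entropies[-1], math.log(10))
```

In this model the entropy never decreases from one step to the next and approaches its largest possible value, the logarithm of the number of cells, as the distribution becomes homogeneous.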

Of course a closed system is an idealization which is never exactly
realized in nature. Most material systems which have a material boundary
surface are only relatively isolated from the rest of the world. They
are 'open systems' of a particular kind, i.e. relatively isolated systems.
The term was introduced by the Polish cybernetician Greniewski¹⁰.


Fig. 15. Material systems having a (connected) material boundary surface.
Panels: a closed or completely isolated system; a relatively isolated system.


For instance, a collection of gas molecules closed within a box becomes
a relatively isolated system if the walls of the box can be pressed so
that the volume of the box changes, or if the walls let heat or electricity
or other forms of energy in and out. If $M_0$ is this kind of collection
of molecules, then the system objects $V_1, \dots, V_n$, representing the
properties of the gas like its temperature, volume, pressure, entropy etc.,
may indicate the influences of the environment on the system and the
influences of the system on the environment through the walls of the
box. The system relation R indicates laws holding between temperature,
volume, pressure etc. under such conditions.

For a relatively isolated system the Law of Entropy is not necessarily
valid. In other words,

the structural and functional organization may increase in a
relatively isolated system.

This is a fundamental property of relatively isolated systems, and we
shall later on study in detail how the organization indeed may increase
in the particular kind of relatively isolated system which we shall call
cybernetic.

10. Henryk Greniewski, Cybernetics Without Mathematics, Pergamon Press 1960.


4 / Cybernetic Systems 


As pointed out by Oskar Lange¹¹, "dialectical materialism asserts the
existence of material systems, the elements of which are linked by
a chain of cause-and-effect relations".

It is thus in accord with the general program of dialectical materialism 
to assume material systems whose interaction with the environment 
obeys a cause-and-effect relation. Obviously such systems are always 
relatively isolated. The specification of cause and effect in the interaction 
of a relatively isolated system with the environment gives us the notion 
of cybernetic system. This specification can be done as follows. 

Let $(M_0, R)$ be a material system which has a relatively isolating
material boundary surface. We specify the system relation
$R \subset V_1 \times \dots \times V_n$ by dividing the system objects into two factors,

(3) $X = V_1 \times \dots \times V_m$, and $Y = V_{m+1} \times \dots \times V_n,$

where $V_1, \dots, V_m$ are the cause objects and $V_{m+1}, \dots, V_n$ are the
effect objects.

The distinction between the cause and effect objects cannot be
performed on any formal grounds but must be based on the recognition
of the real conditions under which the interaction of the system with
the environment happens. The causes $V_1, \dots, V_m$ indicate the influence
of the environment on the system, while the effects $V_{m+1}, \dots, V_n$
indicate the influence of the system on the environment.


11. Oskar Lange, Wholes and Parts, p. 1, Pergamon Press 1965. 




When the system objects $V_1, \dots, V_n$ of a relatively isolated system
have been divided into causes and effects we say that cybernetic causality
is explained for the system. Such a system will be called cybernetic in
this book. In such a system the system relation can be expressed as
a relation between the total cause X and the total effect Y:

(4) $R_c \subset X \times Y.$

When we speak of a causal relation in the cybernetic sense we refer
to such a relation $R_c$ in this book. A cybernetic system can now be
defined as a combination

(5) $S = (M_0, R_c)$ where $R_c = \mathrm{Ch}\, M_0.$

The relation $R_c$ represents the fundamental causal recursion defined
in the cybernetic system S.

For instance, a collection of gas molecules closed in a box which 
has relatively isolating walls becomes a cybernetic system as soon as 
we distinguish the system objects, which indicate the influence of the 
environment on the system, from those objects by which the system 
acts on the environment. If we press the box, then the change $\Delta V$ of
the volume of the box indicates the influence of the environment on
the system, while the response of the system to the environment can
be transmitted, for instance, by means of the resulting increase $\Delta p$
in gas pressure or by the increase $\Delta T$ of gas temperature. Thus we
would have in this case

$X = \{V\}$ and $Y = \{p\} \times \{T\}$,

where $\{V\}$, $\{p\}$, and $\{T\}$ are the sets of the possible values ("states")
of volume, pressure, and temperature, respectively. The relation $R_c$
between X and Y would in this case be (for an ideal gas)

$pV/T = \mathrm{const.}$,

when written as the rule by which the elements of $R_c \subset X \times Y$ are chosen
(for the expression of a relation by such a rule, see p. 33).

On the other hand, if we bring heat into the system the resulting
increase $\Delta T$ of gas temperature would now indicate the influence of
the environment on the system. The subsequent increase $\Delta V$ of gas
volume and the increase $\Delta p$ of gas pressure would now transmit the
system's response to the environment. Accordingly, in this case we
would have $X = \{T\}$ and $Y = \{p\} \times \{V\}$. We observe that the distinction
between cause and effect cannot be performed on any formal grounds
but only on the basis of the analysis of the real sequence of events in
each particular case. The formal rule expressing the relation $R_c$ would
still have the same form, viz. $pV/T = \mathrm{const.}$
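The two divisions of the gas example can be illustrated by a minimal sketch. It assumes one mole of ideal gas in SI units (so the constant is taken as 8.314), and the helper names `in_relation`, `outputs_for_volume`, and `outputs_for_temperature` are hypothetical, introduced only for this illustration. The same rule pV/T = const. is used in both cases; only the choice of what counts as input and what as output differs.

```python
# One formal rule p*V/T = C, read under two different cause/effect divisions.
C = 8.314  # assumed value of the constant (one mole of ideal gas, SI units)

def in_relation(p, V, T, tol=1e-9):
    """True if the triple (p, V, T) satisfies the rule p*V/T = C."""
    return abs(p * V / T - C) <= tol * C

def outputs_for_volume(V, temperatures):
    """Division X = {V}, Y = {p} x {T}: outputs (p, T) compatible with input V."""
    return [(C * T / V, T) for T in temperatures]

def outputs_for_temperature(T, volumes):
    """Division X = {T}, Y = {p} x {V}: outputs (p, V) compatible with input T."""
    return [(C * T / V, V) for V in volumes]

pairs_1 = outputs_for_volume(0.02, [280.0, 300.0, 320.0])
pairs_2 = outputs_for_temperature(300.0, [0.01, 0.02, 0.04])

# Every generated pair lies in the same relation, whichever division was used.
assert all(in_relation(p, 0.02, T) for p, T in pairs_1)
assert all(in_relation(p, V, 300.0) for p, V in pairs_2)
```

Nothing in the code decides which division is the right one; that choice has to come from the analysis of the actual sequence of events, as the text stresses.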

The elements of X will be called the input states of the system, and 
the elements of Y the output states of the system. Using these terms 
the causal interaction of a cybernetic system with the environment can 
be described in the following way. The environment acts on the system 

"by inducing in it certain states of a strictly defined type such as
temperature, pressure, electric charge, feeling, sense impression"
(Lange, ibid., p. 4); the combinations of these states are the
input states of the system.

On the other hand, the system acts on the environment 

"by assuming certain states of a strictly defined character, e.g.
temperature, magnetic field, colour, generation of sounds, motion"
(Lange, ibid., p. 4); the combinations of these states are the
output states of the system.

We can indicate the division of the system objects $V_1, \dots, V_n$ into the
input objects $V_1, \dots, V_m$ (the cause objects) and the output objects
$V_{m+1}, \dots, V_n$ (the effect objects) graphically by letting the former be represented
by "channels" coming to the system S, and the latter by "channels"
departing from the system S (see Fig. 16).


[Figure: the input channels $V_1, \dots, V_m$ enter the system S; the output channels $V_{m+1}, \dots, V_n$ leave it.]

Fig. 16. The division of the system objects in a cybernetic system 


In a cybernetic system we can always — if we like — define even 
a third kind of state concept, viz. the inner states of the system. This 
definition is based on the following theorem: 

A relation $R_c \subset X \times Y$ can always be parametrized by a set
$P = \{f_s;\ s \in \Sigma\}$ of functions from X to Y, i.e. expressed as the union

(6)  $R_c = \bigcup_{s \in \Sigma} f_s$.

Hint for proof: Take an element $y_0 \in Y$ for which the cardinality of
the set $\{(x, y_0);\ x \in X,\ y_0 \in Y,\ y_0\ \text{fixed}\}$ in the relation $R_c$ is maximal.
Introduce the index set $\Sigma_0$ by writing $\{(x, y_0);\ x \in X\} = \{(x_s, y_0);\ s \in \Sigma_0\}$.
Obviously this index set gives the minimal index set for the parametrization
P, and a number of parametrizations can be easily constructed
by using this index set.

The indices s in a parametrization P of $R_c$ are called the inner states
of the cybernetic system in question. Different parametrizations thus
define different sets of inner states for the system. The function

(7)  $f_s: X \to Y$

is the state function, or the behavior function of the system in the inner
state s. It indicates how the system in the inner state s responds to
a given stimulus, i.e. what is the output of the system for a given input:
$f_s(x) = y$, $x \in X$, $y \in Y$.
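For a finite relation the parametrization can be made concrete. The following is a minimal sketch of one simple construction, not the construction indicated in the hint for proof. It assumes that every input x has at least one output y with (x, y) in the relation, and the function name `parametrize` and the example data are hypothetical.

```python
from collections import defaultdict

def parametrize(relation, inputs):
    """Split a finite relation R_c (a set of (x, y) pairs) into functions f_s,
    each given as a dict from X to Y, whose union is again R_c.
    Assumption: every x in `inputs` occurs in at least one pair of R_c."""
    outs = defaultdict(list)
    for x, y in sorted(relation):
        outs[x].append(y)
    # As many inner states as the largest number of outputs of any single input.
    n_states = max(len(outs[x]) for x in inputs)
    functions = []
    for s in range(n_states):
        # Inputs with fewer outputs simply repeat their last output.
        f_s = {x: outs[x][min(s, len(outs[x]) - 1)] for x in inputs}
        functions.append(f_s)
    return functions

R_c = {("a", 1), ("a", 2), ("b", 1), ("c", 2), ("c", 3), ("c", 4)}
X = ["a", "b", "c"]
P = parametrize(R_c, X)

# The union of the state functions reproduces the relation exactly.
union = {(x, y) for f_s in P for x, y in f_s.items()}
assert union == R_c
```

Here the list index s plays the role of the inner state, and `P[s][x]` is the output of the system in the inner state s for the input x. A different way of distributing the outputs over the functions would give a different, equally valid set of inner states.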
To sum up, we have defined the cybernetic system by using the following
tree of successive definitions:


1: general systems
   |-- referring to no uniquely determined material object
   `-- 2: material systems
       |-- no material boundary
       `-- 3: having a material boundary surface
           |-- completely isolated
           `-- 4: relatively isolated
               |-- causality not explained
               `-- 5: causality explained (cybernetic systems)



1: a general system is a relation $R \subset V_1 \times \dots \times V_n$ where the system objects
$V_1, \dots, V_n$ are sets of real objects,

2: material systems are general systems whose system objects refer
to a uniquely determined material object M whose characteristic
relation the system relation is: R = Ch M,

21: a uniquely determined material object is a definite collection of
matter localized in space and time,

211: a material object is considered localized if it is within a certain
volume of space at every moment of time within a given
interval of time, whereby
2111: the volume of localization may be either connected
or composed of several mutually disconnected parts, and
2112: the surface of the volume of localization is not fixed
but can move in the course of time,

212: it is to be decided in each case separately whether the descend- 
ants of a material object, capable of reproducing itself, are 
to be included in the system or not, 

22: the system relation is a characteristic relation of the material 
object M if 

221: it gives a sufficient characterization of the properties of M 
for the particular purpose of the study of the system, and 

222: is invariant over a sufficiently long interval of time to be useful
in the study,

3: a material system has a material boundary surface if its surface is
composed of material elements different from the neighboring elements
inside and outside the system so that not all the energetic impulses from
the environment or from the system penetrate the boundary;
a material system having a boundary surface, connected or disconnected,
defined in space-time and indicating the movements, the
descendants etc. of the system and of its parts, was said to have
a definite topology in space-time,

4: a material system $M_t$ having a definite topology is relatively isolated
if there is some form of interaction between the system and the
environment through the boundary surface, expressed in terms
of the system objects $V_1, \dots, V_n$,

5: cybernetic causality is explained for a relatively isolated system if
the system objects $V_1, \dots, V_n$ are divided into causes and effects,
i.e. objects by which the environment acts on the system and by which
the system responds to the environment: this is not a formal distinction
but one which must be based on the analysis of the real sequence
of events in each particular case; if the system is composed of several
disconnected parts (cf. 211), then the mutual interaction of the parts
is not considered as an interaction between the system and the environment
but as an interaction occurring within the system itself.

In the context of a cybernetic system we use certain terms which 
were defined as follows: 

input states: the elements of $X = V_1 \times \dots \times V_m$, where $V_1, \dots, V_m$ are
the system objects representing the causes,

output states: the elements of $Y = V_{m+1} \times \dots \times V_n$, where $V_{m+1}, \dots, V_n$
are the system objects representing the effects,

causal relation: the system relation of a cybernetic system when
expressed as $R_c \subset X \times Y$,

inner states: the elements of $\Sigma$, when $P = \{f_s;\ s \in \Sigma\}$ gives a parametrization
of the system relation $R_c$ so that $R_c = \bigcup_{s \in \Sigma} f_s$,

state function: each of the functions $f_s$, determining the outputs
of the system in the inner state s by the equation $f_s(x) = y$, for $x \in X$,
$y \in Y$; the state function can also be called the behavior function.

It is not the notion of general system but the notion of cybernetic 
system which is fruitful in the analysis of reality. In short, a cybernetic
system is at once material, topological, and causal.


5 / Explicit Introduction of Time in a Cybernetic System 


Let it be emphasized once more that the causality of which we are speaking
cannot be defined by any formal process but must be concluded on the
basis of an analysis of the sequence of real events in each case. This
is important to observe, since in the West one often tends to define causality
as a formal time-ordering.

To emphasize that causality in the sense of dialectical materialism 
is not merely a formal time-ordering of events on whatever arbitrarily 
chosen ground, time was not at all explicitly introduced in the above 
consideration. We spoke of the system objects $V_1, \dots, V_n$ and of their
division into the causes $V_1, \dots, V_m$ and the effects $V_{m+1}, \dots, V_n$ without
mentioning time. Of course the notion of time was implicitly involved
in our consideration in so far as what we call a cause always precedes
in time what we call an effect. Let us now consider the explicit introduction
of time to a system.

Let us begin by introducing time to a general system $R \subset V_1 \times \dots \times V_n$.
We first associate with such a system a calendar K defined as a subset
of the set T of all the moments of time: $K \subset T$. Each general system
thus has its calendar, i.e. a particular set of the moments of time. The
set T of all time points can here be mapped in a one-to-one way
to the set $\mathbb{R}$ of all real numbers: for each real number there is one point
of time, the real number in question being the indicator of time in
some unit of time. Accordingly, the calendar K of a general system R
can always be mapped in a one-to-one way to a set composed of real
numbers. These real numbers may form a continuum or they may form
a discrete, or even finite set. We say that the system then has a continuum,
discrete, or finite calendar, respectively.

We define the interval $K_t^{t'}$ in the calendar K as the set of the moments
$t''$ of time which obey $t \le t'' < t'$ and belong to K:

$K_t^{t'} = \{t'';\ t \le t'' < t',\ t'' \in K\}$.

In particular, $K_t = \{t\}$ indicates a single time point t.

We are now able to introduce time explicitly to the system objects
$V_1, \dots, V_n$. Let us introduce n abstract sets, viz. $A_1, \dots, A_n$, one for
each system object. Let $v_i$ be a function from K to the set $A_i$. Let us
denote by $v_i|K_t^{t'}$ the restriction of the function $v_i$ to the interval $K_t^{t'}$.
We call such a restriction the segment of the function $v_i$ associated with
the interval $K_t^{t'}$. In particular, the restriction $v_i|K_t$ is an element
of $A_i$ when associated with the point t of time. We call it the momentary
value of the function $v_i$ at the time point t, and denote it by $v_i(t)$:

$v_i(t) = v_i|K_t$.

Now we define the system object $V_i$ as the set of all the segments $v_i|K_t^{t'}$
composed of all the functions $v_i$ from K to $A_i$:

$V_i = \{v_i|K_t^{t'};\ v_i: K \to A_i,\ K_t^{t'} \subset K\}$.

By this we have defined $V_i$ as the set of all the function segments
that can be composed by restricting all the functions from K to $A_i$
to all the intervals $K_t^{t'} \subset K$. The elements of the system objects $V_1, \dots, V_n$
thus are such segments of functions. The elements of different system
objects are here by no means restricted to refer to one and the same
interval $K_t^{t'}$: they may well refer to different intervals.

When we consider a cybernetic system $S = (M_t, R_c)$ instead of a
general system R we must introduce time so that all the cause objects
refer to the same interval of time, the same being true of all the effect
objects. This can be performed in the following way. Let x be a function
from the calendar K of the system to the cartesian product $A_1 \times \dots \times A_m$,
where $A_1, \dots, A_m$ are the abstract sets associated with the cause objects
$V_1, \dots, V_m$ of the system. Let $x|K_t^{t'}$ be the restriction of x to the interval
$K_t^{t'}$. The set X of all the possible input states of the system is then
defined in a time-explicit form by

(8)  $X = \{x|K_t^{t'};\ x: K \to A_1 \times \dots \times A_m,\ K_t^{t'} \subset K\}$.

Thus the set of all the possible input states is the set of all the function
segments which can be composed by restricting all the functions x
from K to $A_1 \times \dots \times A_m$ to all the intervals $K_t^{t'}$ of K.

The time-explicit form of the set Y of all the possible output states
can be defined in a similar way:

(9)  $Y = \{y|K_t^{t'};\ y: K \to A_{m+1} \times \dots \times A_n,\ K_t^{t'} \subset K\}$.
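The notion of a function segment can be illustrated for a discrete calendar. The sketch below is only an illustration under stated assumptions: the calendar `K`, the input function `x`, and the helper name `restrict` are hypothetical, and the interval is taken as the set of calendar points t'' with t <= t'' < t'.

```python
def restrict(x, calendar, t, t_prime):
    """Segment of x over the interval from t (included) to t_prime (excluded),
    returned as a dict from time points of the calendar to values of x."""
    return {tt: x(tt) for tt in calendar if t <= tt < t_prime}

K = range(0, 10)  # hypothetical discrete calendar

def x(t):
    """Hypothetical input function from K to a product of two sets A_1 x A_2."""
    return (t % 2, t * t)

seg = restrict(x, K, 3, 6)
assert seg == {3: (1, 9), 4: (0, 16), 5: (1, 25)}

# A segment over a single time point carries just the momentary value x(t).
assert restrict(x, K, 4, 5) == {4: x(4)}
```

An input state in the time-explicit sense is one such segment; the set X collects the segments of all admissible functions over all intervals of the calendar.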


However, the explicit introduction of time to a cybernetic system
is not completed hereby. We must still specify the time dependence
of the cause-and-effect chains contained in the causal relation $R_c \subset X \times Y$.

The cause-and-effect chains in a cybernetic system are based on the 
analysis of the sequence of events which occur in material reality. Of 
course the effect comes in reality always after the cause. This is the 
starting point in the explicit introduction of time to the causal relation 
R.. 

Let us consider an effect associated with a definite moment of time,
i.e. the momentary output

$y(t) = y|K_t$.

If we denote by $K^t$ the part of the calendar which contains all the calendar
time before t, i.e.

$K^t = \{t';\ t' < t,\ t' \in K\}$,

then the cause of $y(t)$ certainly appeared during the interval $K^t$. This
is all that we can say as a general rule on the explicit introduction
of time to the causal relation $R_c \subset X \times Y$.

If we make some assumptions on the state descriptions $\Sigma$ of our
system we can say more. For this purpose let us consider a system
$S^t = (M_t, R_c^t)$ obtained from the original system $S = (M_t, R_c)$ by
restricting the original calendar K to $K^t$. Let $X^t$ and $Y^t$ be the corresponding
restrictions of the sets X and Y of the input and output states.
In the restricted system the causal relation is given by

$R_c^t \subset X^t \times Y^t$.

Let us introduce a parametrization $P^t$ of $R_c^t$, and let us denote
the corresponding state description by

$\Sigma^t = \{s(t)\}$.

When we let t run through the calendar K, we get a set $\{S^t;\ t \in K\}$ of
restricted systems, and a set $\{\Sigma^t;\ t \in K\}$ of state descriptions referring
to different points of time.

If the state descriptions referring to the different points of time
are such that the state $s(t')$ is uniquely determined by the state $s(t)$
at an earlier time t and by the input segment $x|K_t^{t'}$,

$s(t') = g(s(t), x|K_t^{t'})$,

and the momentary output $y(t)$ is uniquely determined by the inner
state $s(t)$,

$y(t) = f'(s(t))$,

the cybernetic system S is called state-determined. The function g is
the state-transition function of the system S, and the function $f'$ is its
output function.

Using the state-transition function we can rewrite the output function
as follows:

$y(t') = f'(s(t')) = f'(g(s(t), x|K_t^{t'})) = f(s(t), x|K_t^{t'})$.

It is usually given in the latter form. Then the state-transition function
g and the output function f are both defined in the cartesian product
$\Sigma^t \times X_t^{t'}$ where $X_t^{t'}$ is composed of the restrictions of all the x's to
the interval $K_t^{t'}$ of the calendar. We can then write:

(10)  $g: \Sigma^t \times X_t^{t'} \to \Sigma^{t'}$  (state-transition),
      $f: \Sigma^t \times X_t^{t'} \to Y^{t'}$  (output function).

Let it be added that the dependence of the functions g and f on the
segment $x|K_t^{t'}$ may imply both the dependence on the values $x(\tau)$
of the function x within the interval $K_t^{t'}$ and the dependence on the
end points t and t' of this interval. Thus the state-transition and the
output functions may be given by

$s(t') = g(s(t), \{x(\tau)\}, t, t')$ where $\tau \in K_t^{t'}$,
$y(t') = f(s(t), \{x(\tau)\}, t, t')$ where $\tau \in K_t^{t'}$.

In this case the system S is called non-time-invariant.
If there is no explicit dependence on t and t', i.e. if

$s(t') = g(s(t), \{x(\tau)\})$ where $\tau \in K_t^{t'}$,
$y(t') = f(s(t), \{x(\tau)\})$ where $\tau \in K_t^{t'}$,

then the system S is called time-invariant. We shall later meet examples
of both time-invariant and non-time-invariant cybernetic systems.

In the above formulae it was assumed that $t' > t$, in which case the
state-transition and the output functions express the state and the
output at the time t' in terms of the state and the inputs associated with
earlier points of time. The state description of the system is then called
non-anticipatory. Obviously, we can consider the state-transition and
the output functions in a non-anticipatory state description as a time-explicit
expression of causality in a state-determined cybernetic system.

However, one should emphasize that the existence of cybernetic
causality has nothing to do with the state description of the system.
The existence of causality concerns the division of the system objects
$V_1, \dots, V_n$ into input and output objects, and is based on the real sequence
of events occurring in material reality. A cybernetic, and thus causal,
system is completely defined even without any introduction of inner
states at all. The introduction of the inner states is a purely formal
procedure, and so also is the choice of a non-anticipatory state description.

One can sometimes also introduce an anticipatory state description to
a cybernetic system. The state description is called anticipatory if it
associates some outputs $y(t')$ with inputs $x(t)$ occurring at a later point
of time, $t > t'$. Such a case occurs in a state-determined system if the
state-transition and the output relations remain formally valid even
for $t' < t$. We shall later give examples of both non-anticipatory and
anticipatory state descriptions in a (causal) cybernetic system¹².


6 / Digital Systems 


The explicit introduction of time to a cybernetic system, as explained 
above, consisted of 


(1) the introduction of a calendar K, of 

(2) the definition of the input and output states $x \in X$, $y \in Y$, as function
segments in this calendar, and of

(3) the specification of time in the cause-and-effect chains of the 
system. 


12. It should be mentioned that in mathematical systems theory it has been customary
to introduce several notions of "causality", some of which are defined by
a purely formal process. For instance, M. D. Mesarovic (Mathematical theory
of general systems and some economic problems, in Kuhn-Szegö (Ed.), Mathematical
Systems Theory and Economics, Springer-Verlag 1969) distinguishes between
"external", "internal", and "time causality". However, the two latter are defined
purely formally and thus do not represent causality as a category explaining
material reality. The first category corresponds to the cybernetic causality as
explained here.




As an example of this procedure we shall now consider the explicit
introduction of time to a particular kind of cybernetic system, viz.
the digital systems. By a digital system we understand a cybernetic
system $S = (M_t, R_c)$ where all the interaction between the system
and the environment as well as between all the different parts of
the system happens in the form of short, countable impulses. Examples
of such impulses are heartbeats and the nervous impulses
exchanged between the neurons in a nervous system. It is essential
that we can distinguish the successive impulses from one another,
count them, and associate each of them with a definite moment of time.
Accordingly, the calendar of a digital system is composed of an enumerable
set of discrete points of time. We can thus always represent the
calendar K of a digital system by a set of successive integers,

$K = \{t_0, t_0+1, t_0+2, \dots, t_n\}$.

Here $t_n$ may or may not be $+\infty$.

Once the calendar of a digital system is introduced we can easily
complete the introduction of time to such a system. An interval
$K_t^{t'} = \{t'';\ t \le t'' < t',\ t'' \in K\}$ now reduces to the set

$K_t^{t'} = \{t, t+1, t+2, \dots, t'-1\} \subset K$

of successive points of time. An input segment $x|K_t^{t'}$ thus reduces
to the set

$x|K_t^{t'} = \{x(t), x(t+1), \dots, x(t'-1)\}$

of momentary inputs. In a similar way an output segment $y|K_t^{t'}$ reduces
to the set

$y|K_t^{t'} = \{y(t), y(t+1), \dots, y(t'-1)\}$

of momentary outputs. The set X of all input states is defined, as before,
as the set of all the possible input segments:

$X = \{x|K_t^{t'};\ x: K \to A_1 \times \dots \times A_m,\ K_t^{t'} \subset K\}$.

In a similar way the set Y of all output states is defined as the set of
all the possible output segments:

$Y = \{y|K_t^{t'};\ y: K \to A_{m+1} \times \dots \times A_n,\ K_t^{t'} \subset K\}$.

We recall that $A_1, \dots, A_m$ are the sets of objects associated
with the system objects $V_1, \dots, V_m$ (the cause or input objects), and
$A_{m+1}, \dots, A_n$ are the sets of objects associated with the system objects
$V_{m+1}, \dots, V_n$ (the effect or output objects). These sets may be any
sets of real objects whatever.




Let us now consider a state-determined digital system. The conditions
of state-determinacy now read, when written for two successive points
of time, t and t' = t+1, as follows:

$s(t+1) = g(s(t), x|K_t^{t+1}) = g(s(t), x(t), t)$  (state-transition),
$y(t+1) = f(s(t), x|K_t^{t+1}) = f(s(t), x(t), t)$  (output function),

for a non-time-invariant system, and

$s(t+1) = g(s(t), x|K_t^{t+1}) = g(s(t), x(t))$  (state-transition),
$y(t+1) = f(s(t), x|K_t^{t+1}) = f(s(t), x(t))$  (output function),

for a time-invariant system. A time-invariant state-determined digital
system is called an automaton in the customary sense. However, we
could also speak of non-time-invariant automata as automata which
are capable of changing themselves. We shall meet such kinds of changing
automata later.
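The difference between the two cases can be made tangible by a minimal sketch. The stepping routine `run` and the two toy systems below are hypothetical illustrations, not examples from the text. Both systems are written with an explicit time argument so that one routine serves both; the time-invariant one simply ignores it.

```python
def run(g, f, s0, inputs, t0=0):
    """Iterate s(t+1) = g(s(t), x(t), t) and y(t+1) = f(s(t), x(t), t),
    starting from the inner state s0 at time t0.
    Returns the final inner state and the list of outputs."""
    s, outputs = s0, []
    for k, x in enumerate(inputs):
        t = t0 + k
        outputs.append(f(s, x, t))
        s = g(s, x, t)
    return s, outputs

# Time-invariant toy system: a parity counter that ignores t.
def g_inv(s, x, t):
    return (s + x) % 2

f_inv = g_inv  # the output equals the new inner state

# Non-time-invariant toy system: it counts impulses only at even times.
def g_var(s, x, t):
    return (s + x) % 2 if t % 2 == 0 else s

f_var = g_var

impulses = [1, 1, 1]
# Shifting the starting time leaves the time-invariant system unaffected ...
assert run(g_inv, f_inv, 0, impulses, t0=0) == run(g_inv, f_inv, 0, impulses, t0=1)
# ... but changes the behavior of the non-time-invariant one.
assert run(g_var, f_var, 0, impulses, t0=0) != run(g_var, f_var, 0, impulses, t0=1)
```

The check used here (same input sequence, shifted starting time) is only a quick test of the explicit dependence on t, not a general criterion.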

In particular, a time-invariant state-determined digital system is
called a finite automaton, if the system objects $A_1, \dots, A_n$ are all finite
sets. Let the number of elements in the sets $A_1, \dots, A_n$ be $l_1, \dots, l_n$,
respectively. Then the number of elements in the total cause object
$A_1 \times \dots \times A_m$ is $l_1 l_2 \cdots l_m = N_1$, and the number of elements in the
total effect object $A_{m+1} \times \dots \times A_n$ is $l_{m+1} l_{m+2} \cdots l_n = N_2$:

$\#(A_1 \times \dots \times A_m) = l_1 l_2 \cdots l_m = N_1$,
$\#(A_{m+1} \times \dots \times A_n) = l_{m+1} l_{m+2} \cdots l_n = N_2$.

Both $N_1$ and $N_2$ are finite numbers. Accordingly, each input function x
may have only $N_1$ different values, and each output function y only $N_2$
different values. These values may be numbered by the integers $1, 2, \dots, N_1$
and $1, 2, \dots, N_2$, respectively. So we get for the maximal ranges of x
and y:

$\max R_x = \{x_1, x_2, \dots, x_{N_1}\}$,
$\max R_y = \{y_1, y_2, \dots, y_{N_2}\}$.

Accordingly, in a finite automaton the ranges of input and output
functions are finite sets.

We can use the finiteness of the ranges of input and output functions
in a finite automaton for the binarization of its inputs and outputs.
By this we mean a decomposition of each possible input value $x_k$ and
of each possible output value $y_k$ into a set of input or output components
each of which can have only two possible values, viz. 0 or 1. This can
be performed simply by expressing the index k as a binary number
in terms of the numbers 0 and 1. For instance, from the first seven
input values $x_1, \dots, x_7$ we get in this way three binary input components
each of which can have either the value 0 or the value 1 in the following
way:


Original input     Binary components:
values:            Input I:    Input II:    Input III:
x_1                0           0            1
x_2                0           1            0
x_3                0           1            1
x_4                1           0            0
x_5                1           0            1
x_6                1           1            0
x_7                1           1            1

Thus a momentary input $x(t)$ which can have the seven values $x_1, \dots, x_7$
can be decomposed into three simultaneous binary inputs $x_I(t)$, $x_{II}(t)$,
and $x_{III}(t)$, each of which can only have the value 0 or the value 1.
We can indicate this graphically by drawing to our system as many
input channels as there are different components of binary inputs. The
output can be treated in a similar way to get a finite number of binary
output channels.
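The binarization of the index k amounts to writing k in base two with a fixed number of digits. A minimal sketch follows; the function name `binarize` and the labels `x_1`, `x_2`, and so on used as dictionary keys are hypothetical conveniences.

```python
def binarize(k, width):
    """Binary components of the index k, most significant component first."""
    return tuple((k >> i) & 1 for i in reversed(range(width)))

# Three binary channels (Input I, II, III) suffice for the seven values x_1..x_7.
table = {f"x_{k}": binarize(k, 3) for k in range(1, 8)}

assert table["x_1"] == (0, 0, 1)
assert table["x_6"] == (1, 1, 0)
assert table["x_7"] == (1, 1, 1)
```

With the indices starting at 1, as in the table of the text, the combination 000 remains unused by the seven values and is free to stand for "nothing happens in any channel".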

Let us consider as examples some very simple finite automata where
the inputs and the outputs have been binarized. The conjunction element,
the disjunction element, the byrocrat, and the multiplier (see Fig. 17)
are all finite automata which have only one and constant inner state.
Thus there is no state-transition in these automata, and their output
functions are represented simply by

$y(t+1) = f(x(t))$,

where the output function f is different for different automata. Representing
both $y(t+1)$ and $x(t)$ in binary form, the functioning of these
simple automata is completely characterized by indicating the transitions
$x(t) \to y(t+1)$ as given in Fig. 17. The binary input or output
0 can be interpreted to represent the phenomenon "nothing happens
in the channel", while the input or output 1 indicates an impulse in the
channel.

A somewhat more complicated finite automaton is the "memory
element" M which has two inner states $s_1$ and $s_2$, and the state-transition
and output functions given in Fig. 18.




A Conjunction Element c, x(t) -> y(t+1):

    00 -> 0
    01 -> 0
    10 -> 0
    11 -> 1

A Disjunction Element d, x(t) -> y(t+1):

    00 -> 0
    01 -> 1
    10 -> 1
    11 -> 1

A Byrocrat b, x(t) -> y(t+1):

    0 -> 0
    1 -> 1

A Multiplier x3, x(t) -> y(t+1):

    0 -> 000
    1 -> 111

Fig. 17. Four one-state finite automata


The Memory Element M:

    s(t), x(t) -> s(t+1):        s(t), x(t) -> y(t+1):

    s_1, 00 -> s_1               s_1, 00 -> 0
    s_1, 01 -> s_1               s_1, 01 -> 0
    s_1, 10 -> s_1               s_1, 10 -> 0
    s_1, 11 -> s_2               s_1, 11 -> 0
    s_2, 00 -> s_2               s_2, 00 -> 0
    s_2, 01 -> s_1               s_2, 01 -> 1
    s_2, 10 -> s_1               s_2, 10 -> 1
    s_2, 11 -> s_2               s_2, 11 -> 1

Fig. 18. A two-state finite automaton




From the five types of finite automata represented in Fig. 17 and 
18 we can construct the conditioned reflex automaton shown in 
Fig. 19. The coupling of the elements in this system follows the rule: 
if an element $S_i$ has an output channel which leads to another element
$S_j$ (as one of its input channels), then the output of $S_i$ in that channel
at time t is simultaneously an input of the element $S_j$. The binarized
inputs and outputs are to be interpreted so that the input or output 
value 0 means ’nothing happens in the channel’ while the input or 
output value 1 means ’an impulse in the channel’. 

We can easily see that the finite automaton of Fig. 19 indeed 
can learn a simple conditioned reflex. For this purpose let us perform 
five experiments with the automaton. 

Experiment 1. Let the memory element M of the automaton be in
the inner state $s_1$. Let an impulse come to the automaton along the
input channel UCS at time t. At time t+1 it is multiplied to two impulses
in the multiplier ×2. One of these impulses dies in the memory element
at time t+2 without changing the inner state $s_1$ (to change it one would
need two simultaneous impulses to M). The other impulse continues
from the first byrocrat at time t+2, and from the second byrocrat
at time t+3, goes through the disjunction element, and is in the reaction
channel R at time t+4. Thus, the automaton has performed an unconditioned
reflex UCS → R.

Experiment 2. Let the memory element M be still in the inner state
$s_1$. Let an impulse come to the system along the input channel CS at
time t. At time t+1 it is multiplied to two impulses in the multiplier.
One of them again dies in M at time t+2 without changing the inner
state $s_1$. The other continues from the first byrocrat at time t+2 but
dies at time t+3 in the conjunction element c. There is no reaction
in the channel R. The automaton could not perform the reflex CS → R.


[Figure: the network of the automaton, with the input channels UCS and CS]


Fig. 19. A time-invariant conditioned reflex automaton 




Experiment 3. Let the memory element be still in the inner state $s_1$.
Let two simultaneous impulses come to the system at time t, one along
the channel UCS and one along the channel CS. Both of them are
multiplied to two impulses at time t+1 in the respective multipliers.
At time t+1 the memory element thus receives simultaneously two
impulses which change the inner state: the new inner state is $s_2$. A memory
trace is thus born as a consequence of reinforcement. At time t+2
both of the impulses which entered M die, because M can react by an
output only if it already was beforehand in the inner state $s_2$. The remaining
two impulses continue from the first byrocrats at time t+2.
The one which entered the system along the channel CS dies at time
t+3 in the conjunction element. The other continues at time t+3 from
the second byrocrat, and at time t+4 from the disjunction element,
and is thus in the channel R at time t+4. The automaton has performed
the unconditioned reflex UCS → R, and learned the conditioned reflex
CS → R.

Experiment 4. Let the memory element now be in the inner state $s_2$,
as a result of the conditioning performed in Experiment 3. Let an impulse
come to the automaton along the channel CS at time t. At time t+1
it is multiplied to two impulses in the multiplier. One of them continues
from the memory element M at time t+2, transferring this element
back to the inner state $s_1$. The other impulse continues at time t+2
from the byrocrat. Thus the conjunction element c receives at time
t+2 two impulses, and accordingly sends out an impulse at time t+3.
This impulse continues from the disjunction element, and is in the channel
R at time t+4. The automaton has performed the conditioned reflex
CS → R which it learned in Experiment 3.

Experiment 5. After Experiment 4 the memory element is again in
the inner state $s_1$. Accordingly it has forgotten the conditioned reflex
CS → R: a single non-reinforced reaction was sufficient to extinguish
the conditioned reflex in this automaton. However, a single reinforcement
is also sufficient to teach this reflex to it again: if we now serve
simultaneously the impulses UCS and CS the memory element again
transits to the inner state $s_2$ and learns the conditioned reflex.
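The experiments can be replayed by a short simulation. The sketch below is a reconstruction under stated assumptions, not the author's own formulation: the wiring (UCS to multiplier, then to M and through two byrocrats to the disjunction element; CS to multiplier, then to M and through one byrocrat to the conjunction element, which also receives the output of M and feeds the disjunction element) is inferred from the timing given in the experiments, and the return of M from state 2 to state 1 on any single impulse is an assumption for the UCS channel. The names `memory_step` and `simulate` are hypothetical.

```python
def memory_step(state, x):
    """Memory element M. Returns (next inner state, output) for state 1 or 2."""
    a, b = x
    if state == 1:
        # Only two simultaneous impulses leave a memory trace; no output yet.
        return (2 if (a and b) else 1), 0
    if a and b:
        return 2, 1
    if a or b:
        return 1, 1  # assumption: any single impulse resets M to state 1
    return 2, 0

def simulate(ucs, cs, state=1):
    """Replay the network for the 0/1 input sequences ucs and cs.
    Returns the sequence observed in the reaction channel R and the
    final inner state of the memory element."""
    # Channel values at the current time; every element delays by one step.
    mu = mc = b1 = b2 = b3 = m = c = r = 0
    reaction = []
    for u, v in zip(ucs, cs):
        reaction.append(r)
        state, m_next = memory_step(state, (mu, mc))
        # The right-hand side uses the old channel values throughout.
        mu, mc, b1, b2, b3, m, c, r = (
            u, v, mu, b1, mc, m_next, b3 & m, b2 | c)
    return reaction, state

pulse, silent = [1, 0, 0, 0, 0], [0, 0, 0, 0, 0]

assert simulate(pulse, silent, state=1) == ([0, 0, 0, 0, 1], 1)   # Experiment 1
assert simulate(silent, pulse, state=1) == ([0, 0, 0, 0, 0], 1)   # Experiment 2
assert simulate(pulse, pulse, state=1) == ([0, 0, 0, 0, 1], 2)    # Experiment 3
assert simulate(silent, pulse, state=2) == ([0, 0, 0, 0, 1], 1)   # Experiment 4
```

In each run the impulse enters at time t = 0 and the reaction, if any, appears at index 4, in agreement with the time t+4 of the text. Experiment 5 is the third call repeated after the fourth.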

Of course, the conditioned reflex in the automaton¹³ of Fig. 19
is very simple. This automaton learns and forgets too quickly to be
realistic. It also needs both the conditioned stimulus CS and the
unconditioned stimulus UCS simultaneously, and not successively as
a real animal does. It is not capable of generalizing the conditioned stimulus
CS in the same way as a real animal is. However, these more realistic
features could easily be achieved by a more complicated finite automaton.


13. This automaton was adapted from Henryk Greniewski, Cybernetics without
mathematics. I have only added a self-consistent calendar. This required the
introduction of the byrocrats.


7 / Analog Systems


In a digital system the interaction between the system and the environment,
and between the different parts of the system, occurs in the form
of pulsating phenomena such as heartbeats or nervous impulses.
Such pulsating phenomena are rather rare in the (macrophysical)
material reality, at least outside living organisms. It seems as if nature
had only with difficulty succeeded in producing digital systems. The
most typical approaches to a digital system in nature are the nervous
systems of humans and animals.

A typical macrophysical causal system is not a digital system: usually 
in non-living nature we cannot distinguish between different impulses 
of interaction, but interaction is more or less continuous in both space 
and time. Such cybernetic systems are called analog systems, for reasons 
that will be explained later. 

A typical (causal) physical system, and thus an analog system, is,
for instance, the system of gas molecules in a closed box which we have
already discussed. In such a system for the volume V of the
gas, the gas pressure p, and the temperature T (measured as absolute
temperature in the Kelvin scale) we have the equation

$pV/T = \mathrm{const.}$

Thus, whatever changes p, V, and T may go through in the course of
time the quantity $pVT^{-1}$ is always constant (in an ideal gas).

Accordingly, if we press (cf. Fig. 20) the box, or the cylinder in
which the gas is enclosed, so that the volume is changed by $\Delta V$
(a decrease, so that $\Delta V < 0$), then the gas pressure $p$ and the gas
temperature $T$ are increased by the respective amounts $\Delta p$ and
$\Delta T$ so that the changed values again obey the above law:

\[ \frac{(p+\Delta p)(V+\Delta V)}{T+\Delta T} = \frac{pV}{T} = \text{const.} \]

By expanding the product we get $(p+\Delta p)(V+\Delta V) = pV + p\,\Delta V + V\,\Delta p + \Delta p\,\Delta V$.
But if $\Delta V$, $\Delta p$, and $\Delta T$ are small we can eliminate here $\Delta p\,\Delta V$




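as a second-order small magnitude, and get $(p+\Delta p)(V+\Delta V) = pV + p\,\Delta V + V\,\Delta p = pV + \Delta(pV)$.
Here we have used the rule for differentiating a product, according to
which $\Delta(pV) = p\,\Delta V + V\,\Delta p$ for small $\Delta p$
and $\Delta V$. Substituting this result into the above formula, and writing
$c$ for the constant, we get

\[ \frac{pV + \Delta(pV)}{T+\Delta T} = \frac{pV}{T} = c. \]

Since $pV = cT$, the changes $\Delta p$, $\Delta V$, and $\Delta T$ accordingly obey the law

\[ p\,\Delta V + V\,\Delta p = c\,\Delta T. \]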
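
The linearized law can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the numerical values of $p$, $V$, $T$, $\Delta V$ and $\Delta T$ are assumptions chosen only for illustration. It compares the pressure change required by the exact law $pV/T = c$ with the one given by $p\,\Delta V + V\,\Delta p = c\,\Delta T$.

```python
# Sketch: exact ideal-gas response versus the linearized law
# p*dV + V*dp = c*dT. All numbers are illustrative assumptions.

def exact_dp(p, V, T, dV, dT):
    """Pressure change that keeps p*V/T exactly constant."""
    c = p * V / T                      # the constant of the gas law
    return c * (T + dT) / (V + dV) - p

def linear_dp(p, V, T, dV, dT):
    """Pressure change from the first-order law p*dV + V*dp = c*dT."""
    c = p * V / T
    return (c * dT - p * dV) / V

p, V, T = 100_000.0, 1.0e-3, 300.0     # Pa, m^3, K (assumed values)
dV, dT = -1.0e-6, 0.5                  # small compression, small heating

dp_exact = exact_dp(p, V, T, dV, dT)
dp_lin = linear_dp(p, V, T, dV, dT)

# The two results differ only by the neglected second-order term dp*dV/V.
residual = abs(dp_exact - dp_lin)
print(dp_exact, dp_lin, residual)
```

The residual equals $|\Delta p\,\Delta V|/V$, which is exactly the second-order term dropped in the derivation.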


Fig. 20. An analog system where time is not explicitly introduced: 
the gas in a cylinder 


Now it is evident that when we press the cylinder and thus change 
the volume of the gas by AV, the pressure and the temperature are not 
suddenly changed by the respective amounts Ap and AT determined 
by the above law. The pressure and the temperature are rather increased 
continuously to the values of equilibrium indicated by the law. In fact 
the increase of the pressure p and the temperature T as a consequence 
of the decrease of the volume V are time processes but our consideration 
of the gas system has not explicitly involved time. Still the system
obviously is causal. This example shows that an explicit introduction
of time is by no means necessary in the study of causality, and thus
of cybernetic systems. If time is not explicitly introduced, we only have,
instead of the laws of time processes, the laws which characterize the
states of equilibrium to which such processes tend.

Let us now consider a typical physical analog system where time
is explicitly introduced, and thus the time processes can be explicitly
described. Let us consider a mass point having the mass $m$, and moving
along the $x$-axis. The position of the mass point at time $t$ is then indicated




by its $x$-coordinate at time $t$: $x(t)$. The velocity of the mass point is
represented by the time derivative $\dot{x}(t) = \frac{dx}{dt}(t)$, and the acceleration
by the second derivative $\ddot{x}(t) = \frac{d^2x}{dt^2}(t)$. Let a time-dependent force
$u(t)$ act on the mass point. Then the dynamical law which determines
its motion can be written

\[ m\,\ddot{x}(t) = u(t). \]

By integrating from $t_0$ to $t$ we get on the left side:

\[ m\int_{t_0}^{t}\ddot{x}(\tau)\,d\tau = m\int_{t_0}^{t}\frac{d\dot{x}}{d\tau}(\tau)\,d\tau = m\int_{\dot{x}(t_0)}^{\dot{x}(t)} d\dot{x} = m\,\bigl(\dot{x}(t)-\dot{x}(t_0)\bigr), \]

and on the right side:

\[ \int_{t_0}^{t} u(\tau)\,d\tau. \]

Thus for the equations of motion we get the first integrated form

\[ \dot{x}(t) = \dot{x}(t_0) + \frac{1}{m}\int_{t_0}^{t} u(\tau)\,d\tau. \]

Another integration then yields the final integrated form

\[ x(t) = x_0 + (t-t_0)\,\dot{x}_0 + \frac{1}{m}\int_{t_0}^{t} d\sigma \int_{t_0}^{\sigma} u(\tau)\,d\tau. \]

Here we have denoted $x(t_0) = x_0$ and $\dot{x}(t_0) = \dot{x}_0$.
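
As an illustration, the final integrated form can be compared with a step-by-step numerical integration of $m\,\ddot{x} = u$. This is only a sketch under stated assumptions: the force $u(t) = \sin t$, the initial values and the step count are hypothetical choices, and for this particular force the double integral evaluates to $(t-t_0)\cos t_0 - \sin t + \sin t_0$.

```python
# Sketch: closed form of x(t) versus numerical integration of m*x'' = u(t).
# The force u(t) = sin(t) and all numbers are assumed for illustration.
import math

def u(t):
    return math.sin(t)

def position_closed_form(t, t0, x0, v0, m):
    """x(t) = x0 + (t - t0)*v0 + (1/m)*I(u), with the double integral I(u)
    evaluated analytically for u = sin."""
    I = (t - t0) * math.cos(t0) - math.sin(t) + math.sin(t0)
    return x0 + (t - t0) * v0 + I / m

def position_numeric(t, t0, x0, v0, m, steps=20_000):
    """Integrate the equation of motion in small steps (midpoint force)."""
    h = (t - t0) / steps
    x, v, tau = x0, v0, t0
    for _ in range(steps):
        a_mid = u(tau + 0.5 * h) / m       # acceleration at the step midpoint
        x += h * v + 0.5 * h * h * a_mid   # advance the position
        v += h * a_mid                     # advance the velocity
        tau += h
    return x

x_closed = position_closed_form(2.0, 0.0, 1.0, 0.5, 2.0)
x_numeric = position_numeric(2.0, 0.0, 1.0, 0.5, 2.0)
print(x_closed, x_numeric)
```

The two values agree to within the discretization error of the numerical scheme.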

The last equation expresses the position $x(t)$ of the particle at time
$t$ in terms of the combination $(x_0, \dot{x}_0)$ referring to the time point $t_0$,
in terms of the values of the force function $u$ in the interval $K_{t_0}^{t} = (t_0, t)$,
and in terms of the end points $t$ and $t_0$ of this interval. Accordingly,
it can be regarded as defining the output function of a non-time-invariant
state-determined system. In this output function we can represent the
inner state of the system as a 2-component vector

\[ s(t) = \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix}. \]

Then the output function can be written

\[ x(t) = \begin{pmatrix} 1 & t-t_0 \end{pmatrix} s(t_0) + \frac{1}{m}\,I(u) = f\bigl(s(t_0),\, u|K_{t_0}^{t}\bigr), \]

where the functional $I(u)$ is

\[ I(u) = \int_{t_0}^{t} d\sigma \int_{t_0}^{\sigma} u(\tau)\,d\tau. \]




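The state-transition function $g$ is represented in a similar way by
a combination of the two integrated forms of the equations of motion.
The state transition can be written:

\[ s(t) = \begin{pmatrix} 1 & t-t_0 \\ 0 & 1 \end{pmatrix} s(t_0) + \frac{1}{m}\begin{pmatrix} I(u) \\ dI(u)/dt \end{pmatrix} = g\bigl(s(t_0),\, u|K_{t_0}^{t}\bigr). \]

Here

\[ \frac{dI}{dt}(u) = \int_{t_0}^{t} u(\tau)\,d\tau. \]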


If we choose $t_0$ to be smaller than $t$, then $s(t_0)$ represents the initial
state of the system, and the input segment $u|K_{t_0}^{t}$ represents the force
which acts on the system after the time $t_0$ up to the time $t$ when the output
$x(t)$ is given. Thus in this case the solution of the equations of motion
for $x(t)$ or $s(t)$ is non-anticipatory: it connects the output $x(t)$ and
the state $s(t)$ with a past input and a past inner state. In this case the
output relation $x(t) = f\bigl(s(t_0), u|K_{t_0}^{t}\bigr)$ and the state-transition relation
$s(t) = g\bigl(s(t_0), u|K_{t_0}^{t}\bigr)$ can thus be taken as a formal expression of
causality.

However, we can just as well choose $t_0$ greater than $t$, and the
above solutions of the equations of motion for $x(t)$ and $s(t)$ will still
be formally valid. In this case the relations $x(t) = f\bigl(s(t_0), u|K_{t_0}^{t}\bigr)$ and
$s(t) = g\bigl(s(t_0), u|K_{t_0}^{t}\bigr)$ are anticipatory: they connect the output $x(t)$
and the state $s(t)$ with a future input and a future inner state. Of course
the latter way of representing the solution is not very useful since in 
practice we do not know the future values of physical magnitudes. Still 
such a representation is formally possible. This illustrates the fact 
that the formal description of the explicit time process, whether anti- 
cipatory or non-anticipatory, has nothing to do with the existence of 
causality. The existence of causality is independent of the way in which 
the time process is formally described. The existence of causality depends
only on the true sequence of real events in each case, and it
can only be analyzed on an informal basis.
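
The formal symmetry between the non-anticipatory and the anticipatory use of the relation $s(t) = g(s(t_0), u|K_{t_0}^{t})$ can be made concrete with a small sketch. It assumes, purely for simplicity, a constant force, so that $I(u) = \tfrac{1}{2}u\,(t-t_0)^2$ and $dI(u)/dt = u\,(t-t_0)$; the function names and numbers are hypothetical.

```python
# Sketch: the state-transition function g for a mass point under a
# constant force (an assumed special case that gives exact closed forms).

def g(state, t0, t, m, force):
    """Transport the state s(t0) = (x, v) to s(t)."""
    x0, v0 = state
    dt = t - t0
    I = 0.5 * force * dt * dt     # double integral I(u) for constant u
    dI = force * dt               # its time derivative dI(u)/dt
    return (x0 + dt * v0 + I / m, v0 + dI / m)

def f(state, t0, t, m, force):
    """Output function: the position component of the transported state."""
    return g(state, t0, t, m, force)[0]

s_past = (1.0, 0.5)                        # state at t0 = 0
s_later = g(s_past, 0.0, 2.0, 2.0, 3.0)    # non-anticipatory: t0 < t

# Anticipatory use: start from the later state (t0 = 2) and ask for t = 0.
s_recovered = g(s_later, 2.0, 0.0, 2.0, 3.0)
print(s_later, s_recovered)
```

The same formula serves in both directions of time, which is the point made in the text: the formal description does not by itself single out the causal direction.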

If we know the output function f and the inner state of a cybernetic 
system, we can use this system as a computer which computes the values 
of f (the output) for given inputs. Of course we have to construct the 
system so that its output function does the particular calculation which 
we want to be done. In this way we can replace the arithmetical process
of calculation by a physical process, or represent the process of
calculation by a physical analogy. Hence the name analog system for
a non-digital cybernetic system.

The digital systems can of course also be used as computers. Indeed 
they are the prototypes of modern electronic computers. However, 
in a digital system the computation is not realized by replacing an 
arithmetical process by a physical analogy. In digital computers the 
arithmetical process is preserved: the binarized inputs and outputs 
are interpretable as digits themselves. Hence the name digital. 

Of course the examples of cybernetic systems, both digital and analog,
presented above were rather trivial. Their purpose was only to give
a simple illustration of the foregoing definitions.


3 § The Notion 
of Cybernetic Whole 


1 / Wholes and Components 


Let us consider $N$ cybernetic systems $S_1, \ldots, S_N$. Each of them, say
$S_r$, is a relatively isolated material object, $M_r$, whose interaction with
the environment is characterized by a causal relation $R_r$ between
a set $X_r$ of input states and a set $Y_r$ of output states:

\[ S_r = (M_r, R_r), \quad \text{where } R_r = \mathrm{Ch}\,M_r \subset X_r \times Y_r. \]

The set $X_r$ is the cartesian product of some input or cause objects
$X_1^r, \ldots, X_{m_r}^r$, and $Y_r$ is the cartesian product of some output or effect
objects $Y_1^r, \ldots, Y_{n_r}^r$:

\[ X_r = X_1^r \times \ldots \times X_{m_r}^r, \]
\[ Y_r = Y_1^r \times \ldots \times Y_{n_r}^r. \]

Accordingly, each input state $x_r \in X_r$ of the system $S_r$ is a sequence
of the elements of input objects, and each output state $y_r \in Y_r$ of the
system $S_r$ is a sequence of the elements of output objects:

\[ x_r = (x_1^r, \ldots, x_{m_r}^r) \in X_r, \]
\[ y_r = (y_1^r, \ldots, y_{n_r}^r) \in Y_r. \]
If the system $S_r$ being in a given output state $y_r$ may somehow
influence the system $S_s$ being in a given input state $x_s$, or if the system
$S_s$ being in a given output state $y_s$ may influence the system $S_r$ being
in a given input state $x_r$, we say that the systems $S_r$ and $S_s$ are coupled
with one another. If $S_r$ is coupled with $S_s$, and $S_s$ with $S_t$, we say that
$S_r$ is indirectly coupled with $S_t$. If each of the systems $S_1, \ldots, S_N$ is
either coupled or indirectly coupled with each of the others, these systems
together form a cybernetic whole $S$. Accordingly,


we define a ’cybernetic whole’ as a material object composed of directly 
or indirectly mutually coupled cybernetic systems. 
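
Read as a condition on a graph, the definition says that the components form a whole when every pair is linked by a chain of couplings. The following sketch tests this; the function name, the 0-based labelling of the components and the representation of couplings as pairs are assumptions made for the example.

```python
# Sketch: test whether N components form a cybernetic whole, i.e. whether
# every pair is coupled directly or indirectly. Labels 0..n-1 are assumed.

def is_cybernetic_whole(n, couplings):
    """couplings is a set of pairs (s, r): 'an output of component s may
    influence an input of component r'. Being coupled is mutual in the
    definition, so the direction of each pair is ignored here."""
    neighbours = {i: set() for i in range(n)}
    for s, r in couplings:
        neighbours[s].add(r)
        neighbours[r].add(s)
    seen, stack = {0}, [0]             # explore everything reachable from 0
    while stack:
        for j in neighbours[stack.pop()]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == n

chain = is_cybernetic_whole(3, {(0, 1), (1, 2)})   # 0 and 2 indirectly coupled
split = is_cybernetic_whole(3, {(0, 1)})           # component 2 is isolated
print(chain, split)
```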


Such a structured whole is not necessarily itself a cybernetic system. 
However, it is always a system in the sense of general systems theory 
(cf. p. 87). 

Indeed, each of the subsystems, or components, say $S_r$, is associated
with $m_r + n_r$ system objects, viz. $X_1^r, \ldots, X_{m_r}^r, Y_1^r, \ldots, Y_{n_r}^r$. Thus we
can associate with the whole $S$ a totality of $m_1 + n_1 + m_2 + n_2 + \ldots + m_N + n_N$
system objects, viz.

\[ X_1^1, \ldots, X_{m_1}^1,\; Y_1^1, \ldots, Y_{n_1}^1,\; \ldots,\; X_1^N, \ldots, X_{m_N}^N,\; Y_1^N, \ldots, Y_{n_N}^N. \]

The mutual coupling of the components $S_1, \ldots, S_N$ already introduces
a relation between all these system objects. Accordingly, we can consider
the whole $S$, composed of $S_1, \ldots, S_N$, as a general system.


2 / Organization 


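Let us now consider a whole $S$ composed of the cybernetic systems
$S_1, \ldots, S_N$. For this purpose we may introduce the sets $X$ and $Y$ defined by

\[ X = X_1 \times \ldots \times X_N = X_1^1 \times \ldots \times X_{m_1}^1 \times \ldots \times X_1^N \times \ldots \times X_{m_N}^N, \]
\[ Y = Y_1 \times \ldots \times Y_N = Y_1^1 \times \ldots \times Y_{n_1}^1 \times \ldots \times Y_1^N \times \ldots \times Y_{n_N}^N. \]

Let us call the elements $x \in X$ the total input states of the whole $S$, and
the elements $y \in Y$ the total output states of the whole $S$. (Notice that
$X$ and $Y$ are not the sets of input and output states of $S$ in the sense
in which we have used these terms in the context of a cybernetic system.) Each
total input state $x$ and total output state $y$ can thus be represented
as a sequence of the input or output states of the components $S_1, \ldots, S_N$
in the following way:

\[ x = (x_1, \ldots, x_N) = (x_1^1, \ldots, x_{m_1}^1, \ldots, x_1^N, \ldots, x_{m_N}^N) \in X, \]
\[ y = (y_1, \ldots, y_N) = (y_1^1, \ldots, y_{n_1}^1, \ldots, y_1^N, \ldots, y_{n_N}^N) \in Y. \]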




In the present chapter we shall perform a preliminary study of structured
wholes by considering only their organization, without introducing
time explicitly. We shall distinguish between functional and structural
organization. By functional organization we shall understand a particular
kind of systematics appearing either in the total input $x$ (input organization)
or in the total output $y$ (output organization). The study of input
organization will give us the notion of information, and the study of
output organization the notion of action. By structural organization
we shall understand, first, the coupling of the individual outputs $y_s$
to the individual inputs $x_r$ and, secondly, the coupling by which the
individual inputs $x_r$ determine the individual outputs $y_r$. The term
"individual" is used here to indicate that we are speaking of a single
component of $S$. (See the scheme below.)

In Chapter III we shall continue the study of cybernetic wholes. 
We shall study there in more detail the mode of action and the inner 
laws of motion of a cybernetic whole, introducing time explicitly. 


3 / Structural Organization: Cybernetic Coupling 


We want to specify mathematically what is meant by structural organization
in a cybernetic whole $S$ composed of the components $S_1, \ldots, S_N$.
For this purpose we assume that each of the system objects

\[ X_1^1, \ldots, X_{m_1}^1,\; Y_1^1, \ldots, Y_{n_1}^1,\; \ldots,\; X_1^N, \ldots, X_{m_N}^N,\; Y_1^N, \ldots, Y_{n_N}^N \]

of the general system $S$ forms a variable, each possible value of which is
given by a numerical value times a unit which indicates the dimension
(quality) of the variable in question. Thus we allow the possibility that
different system objects represent different qualities, and have different
dimensions and units. However, we assume the system objects to have
been so composed that all the elements belonging to one and the same
system object have the same quality and the same unit. Remembering
that each of the system objects, $X_j^r$ or $Y_j^r$, is associated with a certain
dimension and unit, we can represent the elements $x_j^r \in X_j^r$ or $y_j^r \in Y_j^r$
of each system object by the respective numerical values, i.e. by real
numbers.

Furthermore, we can make the convention that the numerical value
$x_j^r = 0$ represents the absence of the quality associated with the
system object $X_j^r$, and $y_j^r = 0$ the absence of the quality associated
with $Y_j^r$. If $X_j^r$ (or $Y_j^r$) is a "qualitative", dichotomous variable, the presence
of this quality may be indicated by the numerical value $x_j^r = 1$ (or
$y_j^r = 1$). If $X_j^r$ (or $Y_j^r$) is a quantitative variable, we can represent
by the numerical values $x_j^r > 0$ and $x_j^r < 0$ (or $y_j^r > 0$ and $y_j^r < 0$) the
respective positive or negative magnitudes of this variable.

Scheme: the kinds of organization in a cybernetic whole

    Organization
        Structural organization
            Coupling between the components
            Input-output coupling in a component
        Functional organization
            Input organization:  notion of information
            Output organization: notion of organized action

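An input state $x_r$ of the component $S_r$ can then be mathematically
represented by an $m_r$-component column vector, and an output state
$y_r$ by an $n_r$-component column vector:

\[ x_r = \begin{pmatrix} x_1^r \\ x_2^r \\ \vdots \\ x_{m_r}^r \end{pmatrix}, \qquad y_r = \begin{pmatrix} y_1^r \\ y_2^r \\ \vdots \\ y_{n_r}^r \end{pmatrix}. \tag{13} \]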


We can now specify mathematically the mutual coupling between
the components $S_1, \ldots, S_N$ in the following way. For each output variable
$Y_k^s$ we introduce an output channel $(Y_k^s)$ for the system $S_s$, and for
each input variable $X_j^r$ an input channel $(X_j^r)$ for the system $S_r$ (see
Fig. 21). If the output $y_k^s$ gives a contribution to the input $x_j^r$ we
say that there is a coupling from the output channel $(Y_k^s)$ to the input
channel $(X_j^r)$. The contribution itself, which is of course a real number,
we shall denote by

\[ x_{j|k}^{r|s} = c_{jk}^{rs}\, y_k^s. \tag{14} \]

The coefficient $c_{jk}^{rs}$ is a real number which will be called the coupling
parameter.


Fig. 21. The coupling from $S_s$ to $S_r$


By different values of the coupling parameter we can indicate different
kinds of coupling. These may be defined as follows:

    $c_{jk}^{rs} > 0$: an excitative or positive coupling,
    $c_{jk}^{rs} < 0$: an inhibitory or negative coupling,
    $c_{jk}^{rs} = 0$: no coupling.

By the absolute value $|c_{jk}^{rs}|$ we can indicate the strength of the
coupling in question:

    $|c_{jk}^{rs}| > 1$: a strengthening coupling,
    $|c_{jk}^{rs}| < 1$: a weakening or damping coupling,
    $|c_{jk}^{rs}| = 1$: a standard coupling.

Since the value $c_{jk}^{rs} = 0$ expresses the case where there is no coupling
from the channel $(Y_k^s)$ to the channel $(X_j^r)$, we can apply the description
of coupling by means of coupling parameters to every output
channel of $S_s$ and to every input channel of $S_r$. Thus the total coupling
from $S_s$ to $S_r$ is indicated by the $m_r \times n_s$ matrix $C_{rs}$ given by

\[ C_{rs} = \begin{pmatrix} c_{11}^{rs} & c_{12}^{rs} & \ldots & c_{1 n_s}^{rs} \\ \vdots & \vdots & & \vdots \\ c_{m_r 1}^{rs} & c_{m_r 2}^{rs} & \ldots & c_{m_r n_s}^{rs} \end{pmatrix}. \]

This is the coupling matrix from the component $S_s$ to the component
$S_r$ of the whole $S$.

The contribution $x_{r|s}$ of the total output $y_s$ of $S_s$ to the total input
$x_r$ of $S_r$ can now be expressed as the matrix product

\[ x_{r|s} = C_{rs}\, y_s. \tag{15} \]

By summing the contributions $x_{r|s}$ over the index $s$ from 1 to $N$ we
get the internal input of the component $S_r$, i.e. the part of the input
$x_r$ which is due to the influence of the outputs of the other components
of $S$:

\[ x_r^{int} = \sum_{s=1}^{N} x_{r|s} = \sum_{s=1}^{N} C_{rs}\, y_s. \]


The input $x_r$ may also contain contributions coming from the environment
of the whole $S$. This part of $x_r$ may be called the external input
of the component $S_r$. Let us denote it by $x_r^{ext}$. It is of course also
an $m_r$-component vector. Its components indicate the contributions
of the environment of $S$ to each of the $m_r$ components of the input $x_r$.
The input $x_r$ is the sum of the internal and external inputs:

\[ x_r = x_r^{int} + x_r^{ext} = \sum_{s=1}^{N} C_{rs}\, y_s + x_r^{ext}. \tag{16} \]

If $x_r^{ext} \neq 0$ the component $S_r$ is a receptor of the whole $S$. If
$x_r^{ext} = 0$, $S_r$ is a non-receptor component of $S$. We can also make
a similar distinction with respect to the output $y_r$ of the component $S_r$.




If the output $y_r$ has a channel to the environment of the whole $S$, i.e.
if $y_r$ influences the environment of $S$ directly, $S_r$ may be called an
effector of the whole $S$. On the other hand, if the output $y_r$ influences
directly only some other components of $S$, $S_r$ is a non-effector component
of $S$. If all the components $S_1, \ldots, S_N$ of the whole $S$ are both
non-receptors and non-effectors, the whole $S$ is a closed system. As soon
as the whole $S$ has both receptors and effectors, it is of course a cybernetic
system. If $S$ has receptors but no effectors, or effectors but no receptors,
we can call it semi-closed. Thus a cybernetic whole $S$ may theoretically
be of three main types:

    A cybernetic whole is
        a closed system, or
        a semi-closed system, or
        a cybernetic system.
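
The three-way classification can be written down directly. In this sketch the external inputs are given as one list per component, and the presence of an output channel to the environment is given as one flag per component; both representations, and the function names, are assumptions for the example.

```python
# Sketch: classify a whole as closed, semi-closed or cybernetic from its
# receptors and effectors. The data representation is an assumption.

def is_receptor(x_ext_r):
    """A component is a receptor if its external input vector is non-zero."""
    return any(v != 0 for v in x_ext_r)

def classify_whole(external_inputs, effector_flags):
    has_receptor = any(is_receptor(x) for x in external_inputs)
    has_effector = any(effector_flags)
    if has_receptor and has_effector:
        return "cybernetic system"
    if has_receptor or has_effector:
        return "semi-closed system"
    return "closed system"

kind_a = classify_whole([[0.0], [0.0, 0.0]], [False, False])
kind_b = classify_whole([[1.5], [0.0, 0.0]], [False, False])
kind_c = classify_whole([[1.5], [0.0, 0.0]], [False, True])
print(kind_a, kind_b, kind_c)
```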

Let us return to the analysis of coupling. Let us arrange the individual
coupling matrices $C_{rs}$, allowing both $r$ and $s$ to run from 1 to $N$, into
a table so that the subscript $r$ indicates the row and the subscript $s$
the column in which the matrix $C_{rs}$ is located. This arrangement gives
us the total coupling matrix $C$ of the whole $S$:

\[ C = \begin{pmatrix} C_{11} & C_{12} & \ldots & C_{1N} \\ \vdots & \vdots & & \vdots \\ C_{N1} & C_{N2} & \ldots & C_{NN} \end{pmatrix} = m \times n \text{ matrix}. \tag{17} \]

Since the individual coupling matrix $C_{rs}$ has $m_r$ rows and $n_s$ columns,
the total coupling matrix $C$ has $m = m_1 + m_2 + \ldots + m_N$ rows and
$n = n_1 + n_2 + \ldots + n_N$ columns. The total coupling matrix $C$ gives a
mathematical representation for the cybernetic output-input couplings
between the components of a whole.
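
A small numerical sketch may help to connect the block form of $C$ with the component relations $x_{r|s} = C_{rs}\,y_s$ and $x_r = \sum_s C_{rs}\,y_s + x_r^{ext}$. The two-component whole below, its coupling parameters and its outputs are invented for illustration, and indices are 0-based in the code.

```python
# Sketch: assemble the total coupling matrix C from the blocks C_rs and
# compute the inputs of a two-component whole. All numbers are assumed.

def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def internal_input(C_blocks, outputs, r):
    """x_r^int = sum over s of C_rs y_s."""
    total = [0.0] * len(C_blocks[r][0])          # m_r components
    for s, y_s in enumerate(outputs):
        contribution = mat_vec(C_blocks[r][s], y_s)   # x_{r|s} = C_rs y_s
        total = [a + b for a, b in zip(total, contribution)]
    return total

def total_coupling_matrix(C_blocks):
    """Stack the blocks so that r indexes block rows and s block columns."""
    C = []
    for block_row in C_blocks:
        for j in range(len(block_row[0])):
            C.append([c for block in block_row for c in block[j]])
    return C

# Component 1: one input, one output. Component 2: two inputs, one output.
C_blocks = [
    [[[0.0]], [[-0.5]]],                  # C_11, C_12 (inhibitory coupling)
    [[[1.0], [2.0]], [[0.0], [0.0]]],     # C_21 (excitative), C_22
]
outputs = [[3.0], [4.0]]                  # y_1, y_2
x_ext = [1.0, 0.0, 0.0]                   # only component 1 is a receptor

C = total_coupling_matrix(C_blocks)
y = [v for y_s in outputs for v in y_s]   # total output vector
x_total = [a + b for a, b in zip(mat_vec(C, y), x_ext)]
print(C, x_total)
```

The total input obtained from the single matrix $C$ agrees, block by block, with the internal inputs computed component by component.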

Let us then consider the input-output coupling within a given component,
say $S_r$. Since $S_r$ is a cybernetic system we know that it reacts
to each input in a given input state $x_r$ by an output in a given output
state $y_r$. The output $y_r$ depends on the input $x_r$, which in general
represents an input segment received by the system $S_r$. If we introduce
time explicitly we can see that in the most general case each momentary
output $y_r(t)$ of the cybernetic system $S_r$ represents an accumulation




of the influences of a set of past momentary inputs. However, we shall
not yet introduce time explicitly into our consideration of a structured whole;
this will be done later on in Chapter III. Accordingly, we let here
the input state $x_r$ represent the time segment composed of all those
momentary past inputs on which the output $y_r$ depends. Then we
indicate the existence of an input-output coupling in a cybernetic system
$S_r$ by writing

\[ y_r = T_r(x_r). \tag{18} \]

Here $T_r$ is an action operator which changes the input $x_r$ received by
the system $S_r$ to the output $y_r$ of the system. $T_r$ may be either
univalued (a function) or many-valued: its mathematical structure does
not interest us in this discussion, where we do not introduce time
explicitly. (We shall return to this question in Chapter III.)

By arranging the individual action operators $T_1, \ldots, T_N$ into a matrix
so that the operators fill the main diagonal while the other elements
of the matrix are zero we get the total action operator $T$ of the whole $S$:

\[ T = \begin{pmatrix} T_1 & 0 & \ldots & 0 \\ 0 & T_2 & \ldots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \ldots & T_N \end{pmatrix}. \tag{19} \]

This operator is a mathematical representation for the input-output
couplings in all the components $S_1, \ldots, S_N$ of the whole $S$.

The combination (C, 7) of the total coupling matrix C and the total 
action operator T is a mathematical representation for the structural 
organization or cybernetic coupling in a whole S. 


Fig. 22. Qualities represented by channels.


4 / Qualitative and Quantitative Aspects of System Objects 


For the study of functional organization we must develop the formalism
of a cybernetic whole $S$ a bit further: we must consider the qualitative
aspects of the system objects in more detail.




We recall that a structured whole $S$ is always a general
(material) system. We shall use the same formalization of the system
objects here as in the preceding section. Accordingly, we let each of
the system objects

\[ X_1^1, \ldots, X_{m_1}^1,\; Y_1^1, \ldots, Y_{n_1}^1,\; \ldots,\; X_1^N, \ldots, X_{m_N}^N,\; Y_1^N, \ldots, Y_{n_N}^N \]

of the total system $S$ represent a variable. The elements of the set $X_j^r$
or $Y_j^r$ are represented by the possible values of the respective variable
$X_j^r$ or $Y_j^r$. Each possible value of the variable $X_j^r$ or $Y_j^r$ is composed
of a numerical part, the real number $x_j^r$ or $y_j^r$, respectively, and of
a unit indicating the quality or the dimension of the variable.

To each of the system objects, $X_j^r$ or $Y_j^r$, we thus associate a certain
quality $(X_j^r)$ or $(Y_j^r)$. To each input quality $(X_j^r)$ there corresponds
in the graphical representation of the system $S_r$ an "input channel",
and to each output quality $(Y_j^r)$ there corresponds an "output channel"
(see Fig. 22).

The different elements of the system object $X_j^r$ or $Y_j^r$ are then
distinguished from one another by means of the different values of
the real number $x_j^r$ or $y_j^r$ which indicates the numerical value of the
variable $X_j^r$ or $Y_j^r$.

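Accordingly, we have distinguished between

the system objects:
$X_1^1, \ldots, X_{m_1}^1, \ldots, X_1^N, \ldots, X_{m_N}^N,\; Y_1^1, \ldots, Y_{n_1}^1, \ldots, Y_1^N, \ldots, Y_{n_N}^N$,

the qualities:
$(X_1^1), \ldots, (X_{m_1}^1), \ldots, (X_1^N), \ldots, (X_{m_N}^N),\; (Y_1^1), \ldots, (Y_{n_1}^1), \ldots, (Y_1^N), \ldots, (Y_{n_N}^N)$, and

the magnitudes:
$x_1^1, \ldots, x_{m_1}^1, \ldots, x_1^N, \ldots, x_{m_N}^N,\; y_1^1, \ldots, y_{n_1}^1, \ldots, y_1^N, \ldots, y_{n_N}^N$.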

The component $S_r$ has $m_r$ input qualities and $n_r$ output qualities.
Let us denote the sets of its input and output qualities by $Q_r$ and $Q'_r$,
respectively, so that we can write:

\[ Q_r = \{(X_1^r), \ldots, (X_{m_r}^r)\}, \quad m_r = \# Q_r, \]
\[ Q'_r = \{(Y_1^r), \ldots, (Y_{n_r}^r)\}, \quad n_r = \# Q'_r. \]

We shall call any subset of $Q_r$ a combined input quality of the component
$S_r$, and any subset of $Q'_r$ a combined output quality of this
component. The combined input qualities of $S_r$ are thus the elements



of the set $F(Q_r)$ of all the subsets of $Q_r$, and the combined output
qualities are the elements of the set $F(Q'_r)$ of all the subsets of $Q'_r$.
Thus there are $q_r = 2^{m_r}$ combined input qualities, and $q'_r = 2^{n_r}$ combined
output qualities in the component $S_r$:

\[ q_r = \# F(Q_r) = 2^{m_r}, \]
\[ q'_r = \# F(Q'_r) = 2^{n_r}. \]


The whole $S$ has the input qualities and the output qualities of all
of its components $S_1, \ldots, S_N$. Thus the sets of the input and the output
qualities of the whole $S$ are given by the unions

\[ Q = \bigcup_{r=1}^{N} Q_r \quad\text{and}\quad Q' = \bigcup_{r=1}^{N} Q'_r, \]

respectively. The whole $S$ has $m = m_1 + \ldots + m_N$ input qualities and
$n = n_1 + \ldots + n_N$ output qualities:

\[ m = \# Q = \# Q_1 + \ldots + \# Q_N = m_1 + \ldots + m_N, \]
\[ n = \# Q' = \# Q'_1 + \ldots + \# Q'_N = n_1 + \ldots + n_N. \]

Thus the number of the input (or output) qualities of the whole $S$ is
simply the sum of the numbers of the input (or output) qualities of its components.
This is trivial, if we remember that to each input (output) quality there
corresponds an input (output) channel in the graphical representation:
the number of all the input (output) channels within the whole $S$ is
of course the sum of the input (output) channels coming to its components.

The sum rule does not hold for the combined qualities of the whole $S$.
The combined input qualities of the whole $S$ are the subsets of $Q$, and
its combined output qualities are the subsets of $Q'$. Thus their numbers
are given by

\[ q = \# F(Q) = 2^{m} = 2^{m_1 + \ldots + m_N} = q_1 q_2 \ldots q_N, \]
\[ q' = \# F(Q') = 2^{n} = 2^{n_1 + \ldots + n_N} = q'_1 q'_2 \ldots q'_N. \]

Thus the number of the combined input (output) qualities of the whole
$S$ is the product of the numbers of the combined qualities of its
components. This is easy to comprehend, since each combined input
(output) quality corresponds to a set of input (output) channels.
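
The contrast between the sum rule for qualities and the product rule for combined qualities can be verified by enumeration. The quality labels and the sizes $m_1 = 2$, $m_2 = 3$ in this sketch are assumptions for illustration.

```python
# Sketch: count qualities (sum rule) and combined qualities (product rule)
# for a whole with two components. The labels are illustrative only.
from itertools import combinations

def combined_qualities(qualities):
    """All subsets of a set of qualities, i.e. the set F(Q)."""
    items = sorted(qualities)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

Q1 = {"X^1_1", "X^1_2"}                  # input qualities of S_1, m_1 = 2
Q2 = {"X^2_1", "X^2_2", "X^2_3"}         # input qualities of S_2, m_2 = 3
Q = Q1 | Q2                              # input qualities of the whole

m = len(Q)                               # sum rule: m = m_1 + m_2
q1 = len(combined_qualities(Q1))         # 2 ** m_1
q2 = len(combined_qualities(Q2))         # 2 ** m_2
q = len(combined_qualities(Q))           # 2 ** m = q1 * q2
print(m, q1, q2, q)
```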
Turning to the quantitative aspect of the system objects we have
to consider the magnitudes $x_1^1, \ldots, x_{m_1}^1, \ldots, x_1^N, \ldots, x_{m_N}^N,\; y_1^1, \ldots, y_{n_1}^1, \ldots, y_1^N, \ldots, y_{n_N}^N$,
each of which is a real number. In the preceding section
we already combined the input magnitudes $x_1^r, \ldots, x_{m_r}^r$ of the component
$S_r$ to an input state vector $x_r$, and the output magnitudes $y_1^r, \ldots, y_{n_r}^r$
to an output state vector $y_r$. Now we can continue this construction
by representing each possible total input state $x$ of the whole $S$ by
a vector composed of the vectors $x_1, \ldots, x_N$. In a similar way we can
let the total output states $y$ of $S$ be represented by the values of a vector
composed of $y_1, \ldots, y_N$. So we get the total input vector $x$ and the
total output vector $y$ of the whole $S$:

\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \quad\text{and}\quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}. \tag{20} \]

The set $X_r$ of all the input states of the component $S_r$ and the set
$Y_r$ of all the output states of $S_r$ were given by

\[ X_r = X_1^r \times \ldots \times X_{m_r}^r \quad\text{and}\quad Y_r = Y_1^r \times \ldots \times Y_{n_r}^r, \]

respectively. Now we shall make the assumption that each component
system $S_r$ has a finite number $\mu_r$ of possible input states, and a finite
number $\nu_r$ of possible output states:

\[ \mu_r = \# X_r = \#(X_1^r \times \ldots \times X_{m_r}^r) < \infty, \]
\[ \nu_r = \# Y_r = \#(Y_1^r \times \ldots \times Y_{n_r}^r) < \infty. \]

The set $X$ of all the total input states of the whole $S$ and the set $Y$
of all the total output states of $S$ are given by

\[ X = X_1 \times \ldots \times X_N \quad\text{and}\quad Y = Y_1 \times \ldots \times Y_N, \tag{21} \]

respectively. Thus the numbers of input and output states of the whole
$S$ are obtained from the corresponding numbers of the component
systems by the product rule:

\[ \mu = \# X = (\# X_1) \ldots (\# X_N) = \mu_1 \mu_2 \ldots \mu_N, \]
\[ \nu = \# Y = (\# Y_1) \ldots (\# Y_N) = \nu_1 \nu_2 \ldots \nu_N. \]




5 / Input Organization: The Notion of Input Information 


We are now ready to discuss the first aspect of functional organization
in a cybernetic whole, i.e. the organization of total input. We shall
define the cybernetic notion of input information in terms of the
properties of the total input of a cybernetic whole.

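Let us consider the total input vector

\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} = \begin{pmatrix} x_1^1 \\ \vdots \\ x_{m_1}^1 \\ \vdots \\ x_1^N \\ \vdots \\ x_{m_N}^N \end{pmatrix}, \quad\text{where } x_r = \begin{pmatrix} x_1^r \\ \vdots \\ x_{m_r}^r \end{pmatrix}. \tag{22} \]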

When a structured whole $S$ is in a given total input state $x \neq 0$ we say
that it receives some 'information' from this total input state. We shall
first distinguish between the content and the magnitude of this notion
called 'information'.

When speaking of the information content of a given total input
state $x$ we distinguish between information quality and information
pattern. Let us first define the information quality $Q_x$ of the total input
state $x$.

If the component $x_j^r$ of the vector $x$ is different from zero,

\[ x_j^r \neq 0, \]

we say that the whole $S$ "receives" the input quality $(X_j^r)$ when the whole
is in the total input state $x$. We define the information quality $Q_x$ of
the total input state $x$ as the set of all the input qualities the whole
$S$ receives when it is in the state $x$:

\[ Q_x = \{(X_j^r);\; x_j^r \neq 0\}. \]




In other words, the information quality $Q_x$ of the state $x$ is the combined
input quality which the whole $S$ receives when it is in the state $x$. The
set $Q_x$ is of course a subset of the set $Q$ of all the input qualities of the
whole $S$, and an element of the set $F(Q)$ of all the combined input
qualities of $S$:

\[ Q_x \subset Q, \quad Q_x \in F(Q). \]

The set $F(Q)$ has $q = 2^m$ elements. Thus there are $q$ possible information
qualities which the whole $S$ can receive from a total input state.
We express this by saying that the whole $S$ is able to distinguish between
$q$ different information qualities.

Each of the component systems, say $S_r$, has only $q_r = 2^{m_r}$ combined
input qualities, and thus is able to distinguish only between $q_r$ different
information qualities. The number $q$ of the information qualities which
the whole $S$ can distinguish is the product of the numbers of information
qualities which can be distinguished by the components: $q = q_1 q_2 \ldots q_N$. This
is easy to comprehend, since an information quality corresponds to
the set of those input channels along which a non-vanishing input
is coming when the system is in a given input state.

The information quality $Q_x$ is an aspect of the information content
of a total input state $x$. Another aspect of its information content is
the information pattern $P_x$ of the state $x$. By the information pattern $P_x$
we understand the distribution of the magnitudes of the non-vanishing
components of $x$ over the input qualities received by the system $S$
when it is in the state $x$. Accordingly, we can define $P_x$ as the restriction
of the vector $x$ to the vector components whose qualities are represented
in the set $Q_x$:

\[ P_x = x|Q_x. \]
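
Both content notions are easy to compute from a given total input state. In the sketch below a total input state is stored as a mapping from a channel label $(r, j)$ to the magnitude $x_j^r$; this representation and the example values are assumptions made for illustration.

```python
# Sketch: information quality Q_x and information pattern P_x of a total
# input state. The dictionary representation and the values are assumed.

def information_quality(x):
    """Q_x: the input channels that carry a non-vanishing magnitude."""
    return {channel for channel, value in x.items() if value != 0}

def information_pattern(x):
    """P_x: the restriction of x to the channels belonging to Q_x."""
    return {channel: value for channel, value in x.items() if value != 0}

# Keys are (r, j): component r, input channel j.
x = {(1, 1): 0.0, (1, 2): 2.5, (2, 1): -1.0, (2, 2): 0.0}

Q_x = information_quality(x)
P_x = information_pattern(x)
print(Q_x, P_x)
```

Two states with the same $Q_x$ but different magnitudes on those channels have the same information quality and different information patterns.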

Let us then consider the quantitative aspects of the information
received by the whole $S$ from a total input state $x$. Before the definition
of the actual magnitude $I_x$ of information contained in a given state $x$
we shall study the information capacity of the system $S$.

Each of the total input states $x$ is a choice among $\mu$ possible total
input states (see the preceding section). We say that the information
capacity $I$ of the total system $S$ is greater the greater the number $\mu$
of its possible total input states. To determine the mathematical relation
between $I$ and $\mu$ we ask for the number of successive yes-or-no choices
we have to make in order to choose one among $\mu$ possibilities. If $I$ is
to be this number we have:

\[ 2^{I} = \mu. \]

Hence:

\[ I = \log_2 \mu. \tag{23} \]

One counts information in terms of yes-or-no choices mainly because
it is the most natural method of counting in the context of the most
important systems of nature, the digital systems.

Each of the component systems, $S_r$, has $\mu_r$ possible input states.
Thus the information capacity $I_r$ of the component $S_r$ is given by

\[ I_r = \log_2 \mu_r. \]

Since $\mu = \mu_1 \mu_2 \ldots \mu_N$, the information capacity of the total system $S$
is the sum of the information capacities of the components:

\[ I = \log_2 \mu_1 + \ldots + \log_2 \mu_N = I_1 + \ldots + I_N. \]
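
The additivity of capacity follows from the product rule for the numbers of states, as a short sketch shows. The numbers of input states assumed for the three components are hypothetical.

```python
# Sketch: information capacity of a whole and of its components.
# The state counts mu_r are assumed values chosen for illustration.
import math

mu_components = [4, 8, 2]                 # mu_1, mu_2, mu_3
mu = math.prod(mu_components)             # product rule: mu = 64

I_total = math.log2(mu)                   # capacity of the whole
I_parts = [math.log2(m) for m in mu_components]   # capacities I_r
print(I_total, I_parts)
```

Here the whole needs six yes-or-no choices to single out one total input state, two for the first component, three for the second and one for the third.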


The definition of information capacity takes into account only the
number of the possible total input states from which a given total
input state $x$ is a choice. It does not take into account any previous
experience the system may have. To define the information magnitude
$I_x$ received by the system $S$ when it is in a given total input state $x$
we have to take into account somehow the earlier experience that the
system $S$ has about the distribution of the total input states. This is
because if, for instance, the system $S$ has been all the past time in one
and the same total input state $x$, the mere continuation of the state $x$
does not give the system anything we want to call 'information'.

We want to define the magnitude $I_x$ of information contained in
the state $x$ in terms of the 'unexpectedness' of the state $x$. To do this
we can express the earlier experience of the system $S$ of the distribution
of the total input states by associating a probability $p(x)$ to each possible
total input state $x$. The magnitude of information, $I_x$, has to be so
defined that it is greater the greater the magnitude $1/p(x)$. In accordance
with the above choice of unit we put:

\[ I_x = \log_2 \frac{1}{p(x)}. \tag{24} \]


If we consider the total system $S$ during an interval of time which
is long enough to bring forth the statistical frequencies $f(x) \approx p(x)$ of
the total input states $x$, we get, for the average information $I_{av}$ received
by the system $S$ during this interval, the formula

\[ I_{av} = \sum_{x \in X} p(x)\, I_x = \sum_{x \in X} p(x) \log_2 \frac{1}{p(x)}. \tag{25} \]




$I_{av}$ is non-negative and has its maximum value when all the probabilities
are equal, i.e. when $p(x) = 1/\mu$ for all $x \in X$. This maximum value
is $I$:

\[ 0 \le I_{av} \le I = \log_2 \mu. \]

This explains the term "information capacity" we used above for the
magnitude $I$. The average information $I_{av}$ is sometimes called information
entropy (sometimes this entropy is defined with the opposite sign).
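
The bounds on $I_{av}$ can be checked on small examples. In this sketch the four total input states and both probability distributions are assumed; the uniform distribution reaches the capacity, while the skewed one stays below it.

```python
# Sketch: information magnitude I_x and average information I_av for two
# assumed probability distributions over four total input states.
import math

def information_magnitude(p_x):
    """I_x = log2(1 / p(x)): the rarer the state, the more information."""
    return math.log2(1.0 / p_x)

def average_information(p):
    """I_av = sum of p(x) * log2(1 / p(x)); states with p(x) = 0 are skipped."""
    return sum(px * math.log2(1.0 / px) for px in p.values() if px > 0)

uniform = {state: 0.25 for state in "abcd"}
skewed = {"a": 0.7, "b": 0.1, "c": 0.1, "d": 0.1}

capacity = math.log2(len(uniform))        # I = log2(mu) with mu = 4
I_uniform = average_information(uniform)
I_skewed = average_information(skewed)
print(capacity, I_uniform, I_skewed)
```

In the skewed case the frequent state "a" carries little information each time it occurs, and the rare states carry more, but their low probability keeps the average below the capacity.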

Each of the component systems, $S_r$, receives from a given total
input state $x$ the amount $I_x(r)$ of information given by

\[ I_x(r) = \log_2 \frac{1}{p(x_r)}. \]

Here $p(x_r)$ is the probability of the input state $x_r$ of the component
$S_r$. For the average information $I_{av}(r)$ received by the component
system $S_r$ we thus get:

\[ I_{av}(r) = \sum_{x_r \in X_r} p(x_r)\, I_x(r) = \sum_{x_r \in X_r} p(x_r) \log_2 \frac{1}{p(x_r)}. \]

This also satisfies the respective condition with respect to the information
capacity:

\[ 0 \le I_{av}(r) \le I_r = \log_2 \mu_r. \]

Let us study the relation between information magnitudes received
by the total system $S$ and by its components. The probability $p(x)$ is
of course the probability of the co-occurrence of all the component
input states $x_1, x_2, \ldots, x_N$ which together form the vector $x$:

\[ p(x) = p(x_1 x_2 \ldots x_N). \]

We define the conditioned probability $p(x_k \mid x_1 x_2 \ldots x_{k-1})$ by

\[ p(x_k \mid x_1 x_2 \ldots x_{k-1}) = \frac{p(x_1 x_2 \ldots x_k)}{p(x_1 x_2 \ldots x_{k-1})}. \]

Then we can expand the probability $p(x)$ as follows:

\[ p(x) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_1 x_2) \ldots p(x_N \mid x_1 x_2 \ldots x_{N-1}). \]

Accordingly, we get

\[ I_x = \log_2 \frac{1}{p(x)} = I_x(1) + I_x(2 \mid 1) + I_x(3 \mid 12) + \ldots + I_x(N \mid 12\ldots(N-1)), \]
\[ I_{av} = I_{av}(1) + I_{av}(2 \mid 1) + \ldots + I_{av}(N \mid 12\ldots(N-1)), \tag{26} \]




where we have used the notations

$$ I_x(k|12..(k-1)) = \log_2 \frac{1}{p(x_k \mid x_1 x_2 \dots x_{k-1})} , $$

$$ I_{av}(k|12..(k-1)) = \sum_{x \in X} p(x_1 \dots x_N) \log_2 \frac{1}{p(x_k \mid x_1 x_2 \dots x_{k-1})} . $$


Since by definition

$$ p(x_k \mid x_1 x_2 \dots x_{k-1}) \ge p(x_1 x_2 \dots x_k), $$

we get the results

$$ \begin{aligned}
I_x &\le I_x(1) + \dots + I_x(N), \\
I_{av} &\le I_{av}(1) + \dots + I_{av}(N).
\end{aligned} \tag{27} $$

Accordingly, the information $I_x$ and the average information $I_{av}$
obtained by the total system S are both at most equal to the sum of the
respective information obtained by its components $S_1, \dots, S_N$.

This fact can be given the following verbal formulation. The functional
organization of the total input x is expressed by the probabilities $p(x)$
of the states $x \in X$. Due to this functional organization the experience
of the total system S of the distribution of the total input states x is
greater than the experiences of a mere disorganized (uncoupled) collection
of its components $S_1, \dots, S_N$: the experience of the total system
contains even the knowledge of the probabilities of the co-occurrences,
$p(x_1 x_2 \dots x_k)$ (from k = 2 to N), in addition to the individual
probabilities $p(x_1), \dots, p(x_N)$. Since the experience of the total system S as a
structured whole is greater, the same total input state x contains less
news for it than it contains for a mere disorganized collection of the
components $S_1, \dots, S_N$. Indeed, the more functional organization (coupling)
there is between the components of the total input state x, the
farther is the probability $p(x)$ from the situation

$$ p(x) = p(x_1)\, p(x_2) \dots p(x_N) $$

which holds good for a disorganized (uncoupled) collection of the
components $S_1, \dots, S_N$. Only in the situation of disorganization do we
have the sum rule:

$$ \begin{aligned}
I_x &= I_x(1) + \dots + I_x(N), \\
I_{av} &= I_{av}(1) + \dots + I_{av}(N).
\end{aligned} $$
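
The relation between the average information of the whole and of its components can be illustrated for N = 2. The sketch below, in Python, assumes a joint probability table `joint[j][k]` for the co-occurrence of the j-th state of the first component and the k-th state of the second; the two tables are hypothetical numbers chosen by us. It checks the expansion of the average information into a first term and a conditioned term, and the sum rule for the uncoupled case.

```python
import math


def entropy(probs):
    """Average information of a distribution: sum of p * log2(1/p)."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


def marginals(joint):
    """Individual probabilities p(x_1) and p(x_2) from the joint table."""
    p1 = [sum(row) for row in joint]
    p2 = [sum(col) for col in zip(*joint)]
    return p1, p2


def total_average(joint):
    """I_av of the whole, computed from the co-occurrence probabilities."""
    return entropy([p for row in joint for p in row])


def conditional_average(joint):
    """I_av(2|1) = sum over (j, k) of p(j, k) * log2(1 / p(k | j))."""
    p1, _ = marginals(joint)
    total = 0.0
    for j, row in enumerate(joint):
        for p_jk in row:
            if p_jk > 0:
                total += p_jk * math.log2(p1[j] / p_jk)
    return total


# Hypothetical tables: one coupled, one uncoupled (a product of marginals).
coupled = [[0.4, 0.1], [0.1, 0.4]]
uncoupled = [[0.25, 0.25], [0.25, 0.25]]

for name, joint in (("coupled", coupled), ("uncoupled", uncoupled)):
    p1, p2 = marginals(joint)
    print(name, total_average(joint), entropy(p1) + entropy(p2))
```

For the coupled table the first printed number is smaller than the second; for the uncoupled table the two agree.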


To sum up what we have said so far about the notion of input information
we can construct a table which indicates that we have distinguished
between content and magnitude, and between certain content notions
and certain magnitude notions:


                               Input information

  Content notions                Magnitude notions
  Information   Information      Information   Information   Average information
  quality       pattern          capacity      magnitude     (information entropy)
  Q_x           P_x              I             I_x           I_av


We have defined these notions for the whole system S and for the
components $S_1, \dots, S_N$, and we have studied the mutual relations of the
notions referring to the whole with respect to the notions referring to
the components. This is all that we have so far done.

If we like we can decompose all our notions of information into
respective internal and external information. This corresponds to the
decomposition of the input vector $x_r$ of each component $S_r$ into the
internal and external inputs,

$$ x_r = x_r^{int} + x_r^{ext}, $$

which was performed on p. 119. The internal input $x_r^{int}$ comes to the
component system $S_r$ from the other components of the whole S,
while the external input $x_r^{ext}$ comes from the outer environment of
the whole S. We can extend the decomposition to the total input vector
by writing

$$ x = x^{int} + x^{ext}, $$

where

$$ x^{int} = \begin{pmatrix} x_1^{int} \\ \vdots \\ x_N^{int} \end{pmatrix} \quad\text{and}\quad x^{ext} = \begin{pmatrix} x_1^{ext} \\ \vdots \\ x_N^{ext} \end{pmatrix} . $$

We can then distinguish the internal information $Q_x^{int}$, $P_x^{int}$, $I_x^{int}$,
$I_{av}^{int}$, received by the total system S from the internal input $x^{int}$,
and coming from the components of S, from the respective external
information $Q_x^{ext}$, $P_x^{ext}$, $I_x^{ext}$, $I_{av}^{ext}$ which it receives from the external
input $x^{ext}$, coming from the outer environment of the whole S.
In a similar way we can distinguish between the internal and external
information received by the component $S_r$. A closed system S of course
receives only internal information.

If our cybernetic whole S is a finite physical object composed of
molecules we can derive an interesting connection between its information
capacity (information entropy) and its thermodynamic entropy.
It follows from the statistical theory of thermodynamics that every
finite material structure can have only a finite number of physical states
which were called by Max Planck the 'elementary complexions' of that
structure. The number $\mu_0$ of the elementary complexions of the system
S thus gives an upper limit for the number $\mu$ of the possible total input
states of S:

$$ \mu \le \mu_0 . $$

For a material system S which has $\mu_0$ elementary complexions the
thermodynamic entropy $\mathcal{S}$ is defined by

$$ \mathcal{S} = k \log_e \mu_0 . $$

The numerical value of the Boltzmann constant k is

$$ k = 1.38 \cdot 10^{-16}\ \frac{\text{erg}}{1\,^{\circ}\text{C}} , $$

where erg is an energy unit, and 1 °C means one degree of temperature
(Centigrade).

We can compare the formula of thermodynamic entropy with the
formula we get for the supremum information capacity sup I of the
system S, if we assume that every physical state of S is able to function
as a total input state of S:

$$ \sup I = \log_2 \mu_0 = \frac{1}{\log_e 2} \log_e \mu_0 . $$

We can, if we like, identify the supremum information capacity of the
material system S with its thermodynamic entropy. This gives a ratio
for the units of measurement used when measuring information and
when measuring thermodynamic entropy. If we call the unit of information
used above a 'bit' (binary unit), we get:

$$ 1\ \text{bit} = \frac{\mathcal{S}}{\sup I} = k \log_e 2 \approx 0.7\,k \approx 10^{-16}\ \frac{\text{erg}}{1\,^{\circ}\text{C}} . $$

In this way we get for a bit of information as a physical equivalent
a very small amount of energy per unit temperature.
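
The conversion factor can be verified with a few lines of arithmetic. The sketch below, in Python, uses the value of k quoted in the text; the function names and the sample numbers of elementary complexions are hypothetical choices of ours. It shows that the ratio of thermodynamic entropy to supremum information capacity does not depend on the number of complexions and always equals k log_e 2.

```python
import math

BOLTZMANN_K = 1.38e-16  # erg per degree, the value quoted in the text


def thermodynamic_entropy(mu_0: int, k: float = BOLTZMANN_K) -> float:
    """Entropy k * ln(mu_0) of a system with mu_0 elementary complexions."""
    return k * math.log(mu_0)


def supremum_information_capacity(mu_0: int) -> float:
    """sup I = log2(mu_0), in bits."""
    return math.log2(mu_0)


def entropy_per_bit(mu_0: int, k: float = BOLTZMANN_K) -> float:
    """Thermodynamic entropy that corresponds to one bit of capacity."""
    return thermodynamic_entropy(mu_0, k) / supremum_information_capacity(mu_0)


# Hypothetical numbers of complexions; the ratio is the same for both.
for mu_0 in (1024, 10**6):
    print(mu_0, entropy_per_bit(mu_0))
```

Both printed ratios are close to 0.96e-16 erg per degree, which the text rounds to 10^-16.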




We shall consider the similarity of meaning between thermodynamic 
entropy and cybernetic organization in connection with output organiza- 
tion and organized action. 

We have in this section discussed the notion of information by defining 
it as a property of the total input states of a material system. From 
a purely formal point of view the same formalism could be applied 
to any other ’source of information’ and not only to the total input 
states of a material system. Such an application is given, for instance, 
by the theory of information of Shannon and Weaver. Of course one 
can make useful application of the notion of information in the context 
of many kinds of ’sources of information’. 


6 / Output Organization: The Cybernetic Notion of Action 


We shall now discuss the second aspect of functional organization 
in a cybernetic whole, viz. the organization of total output. We shall 
define the cybernetic notion of organized action in terms of the total 
output of a structured whole. 

Our study of output organization will be short, because the formalism
developed in the preceding section can also be directly applied to the
study of output organization. We have only to consider, instead of
the total input, the total output vector

$$ y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = \begin{pmatrix} y_{11} \\ \vdots \\ y_{N n_N} \end{pmatrix}, \quad\text{where}\quad y_r = \begin{pmatrix} y_{r1} \\ \vdots \\ y_{r n_r} \end{pmatrix} . \tag{28} $$

When a structured whole S is in a given total output state $y \neq 0$ we say
that it is in the 'action' y. We shall first distinguish between notions
of action content and action organization.




By the action content of a given state y of total output we mean the
combination of the action quality $Q_y$ and action pattern $P_y$ of this
state. The action quality $Q_y$ is the set of all the output qualities $(Y_j)$
for which the magnitude $y_j$ is different from zero in the state y:

$$ Q_y = \{ (Y_j);\ y_j \neq 0 \}. $$

In other words, the action quality $Q_y$ of the output state y is the combined
output quality which characterizes this state. The set $Q_y$ is a subset
of the set $Q'$ of all the output qualities of the whole S and an element
of the set $F(Q')$ of all the combined output qualities of S:

$$ Q_y \subset Q', \qquad Q_y \in F(Q'). $$

The set $F(Q')$ has $q' = 2^n$ elements. We say this in the following
way: the material system S has at its disposal $q'$ different action qualities.

Each of the component systems, $S_r$, has only $q'_r = 2^{n_r}$ combined
output qualities, and thus action qualities. The number $q'$ of the action
qualities of the whole S is the product of the numbers of the action
qualities of the components $S_1, \dots, S_N$: $q' = q'_1 q'_2 \dots q'_N$. The whole S
has an action quality for every set of output channels composed of
the output channels of its component systems.

The action pattern $P_y$ we define as the distribution of the magnitudes
of the non-vanishing components of y over the output qualities of the
system S in the state y. We can indicate this definition by writing $P_y$
as the restriction of y to the vector components whose qualities are
represented in the set $Q_y$:

$$ P_y = y \mid Q_y . $$
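
The two content notions can be made concrete for a small output vector. The following sketch, in Python, represents an output state as a plain list of magnitudes and labels the output qualities by their channel index 1..n; this labelling and the sample vector are assumptions of ours, introduced only for illustration.

```python
def action_quality(y):
    """Q_y: the set of output qualities (channel indices) with non-zero magnitude."""
    return frozenset(j for j, magnitude in enumerate(y, start=1) if magnitude != 0)


def action_pattern(y):
    """P_y: the restriction of y to the qualities contained in Q_y."""
    return {j: magnitude for j, magnitude in enumerate(y, start=1) if magnitude != 0}


def number_of_action_qualities(n):
    """q' = 2**n: the number of subsets of a set of n output qualities."""
    return 2 ** n


# Hypothetical total output state with n = 4 output channels.
y = [0, 2.5, 0, -1]
print(sorted(action_quality(y)))      # [2, 4]
print(action_pattern(y))              # {2: 2.5, 4: -1}
print(number_of_action_qualities(4))  # 16
```

Since $2^{n_1 + n_2} = 2^{n_1} \cdot 2^{n_2}$, the count for the whole is the product of the counts for the components, as stated in the text.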

Turning then to the quantitative aspects of action we define first the
action capacity of the system S. We remember that the number of all
the possible total output states of the system S was $\nu$: $\nu = \#Y$. We say
that the action capacity C of the system S is the greater the greater
is the number $\nu$ of its possible total output states. We measure the action
capacity C in terms of the number of the yes-or-no choices involved
in the choice of a particular action y. Accordingly, we put

$$ 2^C = \nu $$

so that

$$ C = \log_2 \nu . \tag{29} $$

In a similar way we define the action capacity $C_r$ of the component
system $S_r$ in terms of the number $\nu_r$ of its output states:

$$ C_r = \log_2 \nu_r . $$




Since $\nu = \nu_1 \nu_2 \dots \nu_N$, the action capacity of the total system S is the
sum of the action capacities of the components:

$$ C = \log_2 \nu_1 + \dots + \log_2 \nu_N = C_1 + \dots + C_N . $$
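
The additivity of the action capacity follows from the logarithm turning a product into a sum. A minimal Python sketch, assuming three components with hypothetical numbers of output states chosen by us:

```python
import math


def action_capacity(num_output_states: int) -> float:
    """C = log2(nu): the number of yes-or-no choices needed to pick one action."""
    return math.log2(num_output_states)


# Hypothetical numbers nu_r of output states of three components.
component_state_counts = [2, 4, 8]

nu = math.prod(component_state_counts)  # number of total output states
total_capacity = action_capacity(nu)
sum_of_capacities = sum(action_capacity(nu_r) for nu_r in component_state_counts)

print(nu, total_capacity, sum_of_capacities)  # 64 6.0 6.0
```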


For a notion which corresponds to the notion of information
magnitude $I_x$ we have no use in the context of action. In its stead we
go directly to the counterpart of $I_{av}$ in output organization. We assume
that the functioning of the whole S is so organized that each particular
action y has a given probability $p(y)$. (Of course these probabilities,
and also the probabilities $p(x)$ of the total input states, can change in
the course of time; we do not discuss the time process until Chapter III.)
We then define the action entropy E of the system S by the formula

$$ E = \sum_{y \in Y} p(y) \log_2 \frac{1}{p(y)} . \tag{30} $$

The entropy E is non-negative and has its maximum value when all
the probabilities are equal, i.e. when $p(y) = 1/\nu$ for all $y \in Y$. This
maximum value is the action capacity of the system:

$$ 0 \le E \le C = \log_2 \nu . $$


The action entropy $E_r$ of the component system $S_r$ is given by

$$ E_r = \sum_{y_r \in Y_r} p(y_r) \log_2 \frac{1}{p(y_r)} , $$

where $p(y_r)$ is the probability of the output state $y_r$ of $S_r$. Its maximum
value is the action capacity $C_r$ of the component system $S_r$:

$$ 0 \le E_r \le C_r = \log_2 \nu_r . $$


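The relation between $E$ and $E_1, \dots, E_N$ is obtained by applying the
formalism on pp. 128-129. The result is

$$ \begin{aligned}
E &= E_1 + E(2|1) + E(3|12) + \dots + E(N|12..(N-1)), \\
E &\le E_1 + E_2 + E_3 + \dots + E_N .
\end{aligned} $$

Here we have denoted

$$ E(k|12..(k-1)) = \sum_{y \in Y} p(y_1 \dots y_N) \log_2 \frac{1}{p(y_k \mid y_1 \dots y_{k-1})} . \tag{31} $$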
Let us consider the significance of the action entropy E. For this
purpose, let us study the functioning of the system S during an interval
of time which is long enough to bring the frequencies $f(y)$ of the output
states y close to the probabilities $p(y)$: $f(y) \approx p(y)$ for each $y \in Y$. Then
the entropy E calculated on the basis of these probabilities characterizes
the degree of organization versus pure chance appearing in the actions
of the system S during this interval of time. This can be shown by the
following consideration.

Let us consider a situation in which the entropy E has its maximum
value C:

$$ E = \sum_{y \in Y} p(y) \log_2 \frac{1}{p(y)} = \log_2 \nu . $$

This corresponds to the case in which all the probabilities $p(y)$ are
equal, i.e.

$$ p(y) = 1/\nu \quad \text{for all } y \in Y. $$


Accordingly, the appearance of a given action y seems to follow from 
pure chance: there are no regularities, no organization in the acting 
of the system S during the interval of time in question. 

Let us consider the conditions under which the equality of all the
probabilities $p(y)$ is realized. The equality of the probabilities $p(y)$
for all possible $y \in Y$ implies the equality of the probabilities $p(y_r)$ for
all possible $y_r \in Y_r$ for every component system $S_r$. Thus the maximum
value of the total entropy E presupposes the maximum values of the
individual entropies $E_r$ of the components $S_r$:

$$ E_r = \sum_{y_r \in Y_r} p(y_r) \log_2 \frac{1}{p(y_r)} = \max E_r = \log_2 \nu_r , $$

because

$$ p(y_r) = 1/\nu_r \quad \text{for all } y_r \in Y_r . $$


However, this condition is not a sufficient one for the maximality
of the total entropy E. To reach the maximum value of E the components
$S_1, \dots, S_N$ must be uncoupled with one another so that the whole S is
completely disorganized. Under this condition we get

$$ p(y) = p(y_1)\, p(y_2) \dots p(y_N) $$

which gives the sum rule

$$ E = E_1 + E_2 + \dots + E_N . $$

When this is true we get

$$ E = \max E_1 + \dots + \max E_N = \log_2 \nu_1 + \dots + \log_2 \nu_N = \log_2 \nu = E_{max} . $$




Thus, for the actions of the total system S to be completely
disorganized, 1) the actions of each component system $S_r$ must be
disorganized, and 2) the total system S must be a disorganized whole
of its components, so that the components $S_1, \dots, S_N$ are uncoupled
with one another. This is the situation characterized by the maximum
value of the action entropy E. The same situation is characterized,
of course, by the minimum value of the action negentropy $-E$.


The value of the action entropy may thus decrease, or the value
of the action negentropy $-E$ increase, in two ways:

(1) by increasing the organization of the actions of the component
systems $S_r$, indicated by the increasing negentropies $-E_r$, or

(2) by increasing the organization of the total system S as a whole
of its components $S_1, \dots, S_N$, which is indicated by the increasing
difference

$$ \Delta E = E_1 + \dots + E_N - E . $$


We can thus say that the value of the action negentropy $-E$ measures
directly the degree of organization in the total action of the whole S:
the value of $-E$ increases with the increasing organization of the total
action. We can also say: the more organized the action of the whole S
is, the greater is the value of $-E$.

The maximal organization of the action is reached when $E = -E = 0$.
This is the maximum value of the action negentropy $-E$.


Example. To illustrate the entropy considerations let us calculate the action
entropy of a whole S composed of two components $S_1$ and $S_2$. If both of them
have three output states, we can write their individual entropies as

$$ \begin{aligned}
E_1 &= -p_1 \log_2 p_1 - p_2 \log_2 p_2 - p_3 \log_2 p_3 , \\
E_2 &= -s_1 \log_2 s_1 - s_2 \log_2 s_2 - s_3 \log_2 s_3 .
\end{aligned} $$

We choose for the probabilities $p_j$ and $s_k$ the following numerical values:

$$ (p_1, p_2, p_3) = (1/2,\ 1/4,\ 1/4), \qquad (s_1, s_2, s_3) = (1/4,\ 1/4,\ 1/2). $$

Thus the numerical values of the individual entropies are in this case equal:

$$ E_1 = E_2 = (1/2) \log_2 2 + 2\,(1/4) \log_2 4 = 3/2 . $$

Assuming first that $S_1$ and $S_2$ are uncoupled with one another, the probabilities
$P_{jk}$ of the co-occurrence of the output states of $S_1$ and $S_2$ are given by
$P_{jk} = p_j s_k$. This gives the numerical values

$$ P_{jk} = (1/8,\ 1/8,\ 1/4,\ 1/16,\ 1/16,\ 1/8,\ 1/16,\ 1/16,\ 1/8). $$

Hence the action entropy of the total system $S_0$, where the components $S_1$ and $S_2$
are uncoupled with one another, is

$$ E_0 = -\sum_{j=1}^{3} \sum_{k=1}^{3} P_{jk} \log_2 P_{jk} = 3 . $$

Thus we have indeed the sum rule in this case:

$$ E_0 = E_1 + E_2 . $$

Let us then introduce a coupling between $S_1$ and $S_2$. This is reflected in the
probabilities $P_{jk}$ as a probability distribution which deviates from the above one.
For instance, we can choose

$$ P_{jk} = (1/4,\ 1/4,\ 1/8,\ 1/8,\ 1/8,\ 1/32,\ 1/32,\ 1/32,\ 1/32). $$

This gives for the total entropy of the system S the numerical value

$$ E = 2\tfrac{3}{4} < 3 = E_0 . $$

The functional coupling between the components $S_1$ and $S_2$ thus decreased the
total entropy, while the negentropy $-E$ increased from $-3$ to $-2\tfrac{3}{4}$.
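
The numbers of the example can be recomputed directly. The sketch below, in Python, uses exact fractions for the probabilities given in the example; the helper name `action_entropy` and the variable names are ours.

```python
import math
from fractions import Fraction as F


def action_entropy(probs):
    """E = sum of p * log2(1/p) over the states with p > 0."""
    return sum(float(p) * math.log2(1.0 / float(p)) for p in probs if p > 0)


p = [F(1, 2), F(1, 4), F(1, 4)]  # output states of S_1
s = [F(1, 4), F(1, 4), F(1, 2)]  # output states of S_2

# Uncoupled case: P_jk = p_j * s_k, listed row by row.
uncoupled = [p_j * s_k for p_j in p for s_k in s]

# Coupled case: the deviating distribution chosen in the example.
coupled = [F(1, 4), F(1, 4), F(1, 8), F(1, 8), F(1, 8),
           F(1, 32), F(1, 32), F(1, 32), F(1, 32)]

E_1, E_2 = action_entropy(p), action_entropy(s)
E_0 = action_entropy(uncoupled)
E = action_entropy(coupled)

print(E_1, E_2)   # 1.5 1.5
print(E_0)        # 3.0
print(E)          # 2.75
print(E_0 - E)    # 0.25, the gain in negentropy due to the coupling
```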


To sum up what we have said so far about output organization we
can construct the following table to indicate that we have distinguished
between certain notions related with the content, on the one hand,
and the organization, on the other, of action:

                               Action

  Action content               Action organization
  Action      Action           Action      Action entropy E or
  quality     pattern          capacity    action negentropy
  Q_y         P_y              C           -E


We have defined these notions for the whole system S, and for the
components $S_1, \dots, S_N$. We have also studied the mutual relations
between the notions referring to the whole and the notions referring
to the components. This is all we have done so far.

We can if we like speak of internal and external actions y (and $y_r$)
according to whether the output y (or $y_r$) goes to some component(s)
$S_r$ or to the outer environment of S. However, the same output
y (or $y_r$) may influence both some component(s) of S and the outer
environment of S. Thus we cannot decompose the output y (or $y_r$) in
a similar way as we decomposed the input x (and $x_r$) into a sum of internal
and external parts. Only if we know, for instance, that a given action
y (or $y_r$) acts either only on some components of S or only on the
outer environment of S, can we calculate with $y^{int}$ or $y^{ext}$ (and with
$y_r^{int}$ or $y_r^{ext}$). Otherwise a quantitative distinction between internal and
external action is impossible.

For a molecular physical system we have again the number $\mu_0$ of
the elementary complexions of the system S. The number of the
elementary complexions of the system S is as well the upper limit of the
number $\nu$ of possible output states:

$$ \nu \le \mu_0 . $$

By comparing the possible values of the action capacity C with the
value of the thermodynamic entropy $\mathcal{S}$ of the system S we get a theorem
concerning the supremum action capacity:

$$ \sup C = \sup E_{max} = \mathcal{S} = k \log_e \mu_0 . $$

The action capacity, or the maximal action entropy $E_{max}$, of the system
S reaches this supremum value, if the system S is able to use each of
its $\mu_0$ physical states (elementary complexions) as a possible total output
state y.

Just as the action entropy E (or rather the action negentropy $-E$)
measures the degree of action organization of the system S as an organized
whole, the thermodynamic entropy (or rather the negentropy $-\mathcal{S}$)
measures the degree of physical organization in S as a physical system.
The number $\mu_0$ of the elementary complexions depends on many
variables characterizing the physical system S, e.g. its temperature,
total energy, etc. Just like the action organization the physical organization
too can change in the course of time. There is a general law, called
the Law of Entropy (see p. 92), according to which $\mathcal{S}$ increases in every
closed physical system S. Accordingly, in a closed physical system
organization decreases. In a cybernetic system, which by our definition
is always open, the organization as measured by $-E$ may increase
as a consequence of the interaction of the system with its environment.


7 / Levels of Organization 


In the preceding sections we have considered a cybernetic whole S
which was composed of N components $S_1, \dots, S_N$. Usually, when studying
a whole S, it is useful to decompose it to a few first components

$$ S^{(1)}_1, \dots, S^{(1)}_{N_1} $$

and then each of these, $S^{(1)}$, to a few second components

$$ S^{(2)}_1, \dots, S^{(2)}_{N_2} , $$

then each of these, $S^{(2)}$, to third components

$$ S^{(3)}_1, \dots, S^{(3)}_{N_3} , $$

etc. until the components $S_1, \dots, S_N$ are reached. In this way the study
of organization of S reduces to the study in different levels of organization.
The first level of organization is formed by the coupling between
the first components $S^{(1)}_1, \dots, S^{(1)}_{N_1}$. The second level of organization
comprehends the studies of the mutual couplings of the second components
$S^{(2)}_1, \dots, S^{(2)}_{N_2}$ within each of the first components, and so on.
Levels of organization are illustrated in Fig. 23.

At each level of organization we have to study the inner organization
of a number of organized wholes, say $S_1, \dots, S_K$. The coupling matrix
of one of them, $S_k$, is given by some matrix of the form

[M x M coupling matrix of $S_k$; the display is not recoverable from the source]

where M is the number of the components of $S_k$. The coupling matrix
of the 'higher order whole' $S'$ composed of the wholes $S_1, \dots, S_K$ will
then be of the form

$$ C' = \begin{pmatrix}
C_{11} & C_{12} & \dots & C_{1K} \\
C_{21} & C_{22} & \dots & C_{2K} \\
\vdots & \vdots &       & \vdots \\
C_{K1} & C_{K2} & \dots & C_{KK}
\end{pmatrix} . $$
[Figure: tree diagram of the whole S, its first, second and third components, and the components $S_1, \dots, S_{13}$]

Fig. 23. Levels of organization of a whole S composed of 13 components $S_1, \dots, S_{13}$.




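Here the matrices $C_{jk}$ for $j \neq k$ contain the couplings between the
wholes $S_1, \dots, S_K$, while the matrices $C_{kk}$ contain the couplings
within the respective wholes $S_k$.

The action operator $T'$, the total input vector $x'$, and the total output
vector $y'$ of the higher order whole $S'$ are simply the 'direct sums' of
the respective operators or vectors of the wholes $S_1, \dots, S_K$:

$$ T' = \begin{pmatrix}
T_1 & 0 & \dots & 0 \\
0 & T_2 & \dots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \dots & T_K
\end{pmatrix}, \qquad
x' = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{pmatrix}, \qquad
y' = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_K \end{pmatrix} . $$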


Thus we can study both the structural and functional organization 
separately at each level of organization. 
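
The assembly of a higher order whole from its parts can be sketched with plain nested lists. The Python sketch below is only an illustration under two assumptions of ours: the action operators are represented as numerical matrices (the text does not require them to be linear), and all function names and numbers are hypothetical.

```python
def direct_sum_vectors(vectors):
    """Stack the vectors of the wholes S_1..S_K into one total vector."""
    return [value for vector in vectors for value in vector]


def direct_sum_matrices(matrices):
    """Place the matrices T_1..T_K on the diagonal and fill the rest with zeros."""
    total_cols = sum(len(m[0]) for m in matrices)
    rows, offset = [], 0
    for m in matrices:
        width = len(m[0])
        for row in m:
            rows.append([0] * offset + list(row) + [0] * (total_cols - offset - width))
        offset += width
    return rows


def block_matrix(blocks):
    """Assemble C' from blocks[j][k] = C_jk, each given as a list of rows."""
    rows = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            assembled = []
            for block in block_row:
                assembled.extend(block[i])
            rows.append(assembled)
    return rows


# Hypothetical example with K = 2 wholes of sizes 2 and 1.
C_11, C_12 = [[1, 2], [3, 4]], [[5], [6]]
C_21, C_22 = [[7, 8]], [[9]]

print(block_matrix([[C_11, C_12], [C_21, C_22]]))  # [[1, 2, 5], [3, 4, 6], [7, 8, 9]]
print(direct_sum_matrices([C_11, C_22]))           # [[1, 2, 0], [3, 4, 0], [0, 0, 9]]
print(direct_sum_vectors([[1, 2], [3]]))           # [1, 2, 3]
```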




CHAPTER III: 


Cybernetic Theory of 
Self-Generating Processes 


1 § The Self-Generating Process


1 / General 


The world of dialectical materialism is a changing world. Dialectical
materialism asserts that material reality is in a ceaseless state of change:
it is self-generating, 'self-moving' for ever. The self-generated change,
or 'motion', has a certain direction. Such a change is called development.

The world is pushed into motion by dialectical contradictions which
are constantly formed and solved, and which generate themselves.
The dialectical contradictions, which of course must be strictly
distinguished from logical contradictions, are formed and solved as a
consequence of causal interaction between the different parts of material
reality¹⁴. Referring to the self-generating dialectical contradictions as
the causal factor of development we can say that development is a
self-generating dialectical process.

The dialectical contradictions are either antagonistic or non-antagonistic 
by their nature. The non-antagonistic contradictions can be removed 
and conciliated within the system where they are formed. The antagonistic 
contradictions break the system, and thus open a completely new line 
of development, which builds a new kind of system upon the ruins 
of the old one. 


14. We are speaking here of dialectical processes occurring in the objective,
material reality. In addition to these processes dialectical materialism discusses
processes of consciousness, which are reflections of the processes of
material reality in human consciousness.




Corresponding to the two kinds of dialectical contradictions, non- 
antagonistic and antagonistic, a typical dialectical process is developed 
through a succession of two phases. First, there is a phase of gradual, 
accumulating development whose driving forces are both the antagonistic 
and the non-antagonistic contradictions within the system in question. 
During the period of gradual development the antagonistic contradict- 
ions, if such exist in the system in question, are sharpened until, as a 
consequence of their existence, an abrupt, qualitative change occurs in 
the system. In such a 'jump' the main qualitative characteristics of the
system, viz. its structure and its mode of action, are abruptly changed.
Hence, a new period of gradual development begins within the new 
system, leading to ever higher forms of organization. 

A material system where a dialectical process of development occurs,
as a consequence of the internal contradictions of this system, is often
referred to as a 'whole' in dialectical materialism. The internal
contradictions of a whole are due to causal interactions between the parts of
the whole in question. External contradictions, that is, contradictions
due to causal interaction between the whole and its outer environment,
may also contribute to the determination of the direction of the process.
However, the internal contradictions, built into the structure and
the mode of action of the whole, are the necessary moving forces
which push the dialectical process into movement.

The whole of the universe can be considered as a whole where dia- 
lectical development occurs. However, the development of the universe 
as a whole is very slow. There are long periods of gradual development, 
in the course of which the mode of action of the universe, the natural 
laws, are changing very slowly. 

Much more rapid is the dialectical process in the living population 
on the earth. The development of species shows periods of gradual 
change as well as abrupt changes. A striking qualitative change was 
the first appearance of human beings, capable of making tools and 
of cooperation, on the earth. Human societies, based on social relations 
formed by the cooperation and by the mutual communication of human 
beings by means of language, formed wholes which began dialectical 
processes themselves. The dialectical development pushed into motion 
by the internal and external contradictions of human societies is called 
the historical process of mankind. 

In the course of the historical, dialectical process of development
the structure and mode of action of the social formations of human
beings undergo both gradual development and abrupt changes. The
former are called periods of evolution, the latter periods of revolution.
Both evolution and revolution are necessary phases in the development
of human societies.

We shall study in this chapter the formation of internal contradictions
and dialectical process in a mathematical model. Such a model is offered
by cybernetic method. However, we shall study here only a restricted
aspect of the dialectical process: the structural and functional conditions
under which a dialectical process is started in a whole with a given
structure and a given mode of action. Thus we get a classification of
internal contradictions, and a classification of the processes started
by these contradictions. We shall not study the whole succession of
the dialectical process composed of periods of gradual development
and abrupt changes, leading to qualitatively higher forms of development.
For a study of the latter problem the present cybernetic methods are
insufficient. (A promising future approach to this problem seems to be
in von Neumann's theory of the development of localized automata:
see the remark on the role of complexity as a factor producing qualitative
changes leading to higher forms of development, p. 169).


2 / Dialectical Contradiction Within a Cybernetic Whole 


Let us consider a cybernetic whole S composed of the components
$S_1, \dots, S_N$. Our purpose is to study the internal conditions for the
formation of dialectical process within this whole. To simplify the
study we shall neglect irrelevant factors, and thus consider a whole S
which is reduced to its 'active elements' coupled with one another.
This means the following.

Each component $S_r$ is a cybernetic system having $m_r$ input channels
and $n_r$ output channels but only one inner state which never changes.
Such a component is called an active element, since it does nothing
but transform a cause (the input) in a strictly deterministic, unique
way to an effect (the output), thus changing its environment. The action
operator $T_r$ of an active element is thus a function, and associates
with each input state $x_r$ one and only one output state $y_r$:

$$ y_r = T_r(x_r) \quad \text{(one-to-one)}. \tag{1} $$


Letting the channel along which an input comes determine the quality
of the input in question, we can represent each input state $x_r$ as an
$m_r$-component real vector, whose component $x_{rj}$ indicates the magnitude
of the numerical value of the input coming along the j-th channel,
or the numerical index of the state component in question. Similarly,
we represent each output state $y_r$ by an $n_r$-component real vector
(cf. p. 118).

In general, a given input state $x_r$ can be expressed as

$$ x_r = \sum_{s=1}^{N} C_{rs}\, y_s + x_r^{ext} , $$

where $C_{rs}$ is the $m_r \times n_s$ coupling matrix from $S_s$ to $S_r$, and $x_r^{ext}$ is
the external input (cf. p. 119). However, we shall study the internal
process, and put

$$ x_r^{ext} = 0 $$

for the whole time during which we are observing the process (we assume that
the system S has received earlier some external input, and we are now
studying the internal process originated by it together with some internal
background). Hence:

$$ x_r = \sum_{s=1}^{N} C_{rs}\, y_s . $$


The total input vector x of the whole S has $m = m_1 + \dots + m_N$
components, and the total output vector y has $n = n_1 + \dots + n_N$ components.
Written for them the coupling of the elements of S is expressed by

$$ x = Cy . \tag{2} $$

Here C is the $m \times n$ matrix composed of the matrices $C_{rs}$, and called
the total coupling matrix (cf. p. 120).

Each individual action operator $T_r$ is a vector of $n_r$ components.
We can construct out of them the $N \times N$ quasi-diagonal matrix T, the
total action operator of the whole S (cf. p. 121). The total action operator
T indicates how a cause x acting upon the elements of the system S
brings forth the effect y. On the other hand, the total coupling matrix C
indicates how such an effect y acts as a new cause upon the elements.
Thus T and C together represent the causal process occurring within
the whole S.

Let us now pick out a certain total input state x, and study its effects.
We shall call x the primary cause. In the internal, causal process occurring
in the system S the primary cause is first transformed to the primary
effect y:

$$ x \to y = T(x). \tag{3} $$

Then the primary effect is transformed to the first derived cause $\bar{x}$:

$$ y \to \bar{x} = Cy . \tag{4} $$

Combining the two formulae, we have the elementary causal act

$$ x \to \bar{x} = CT(x) = R(x). \tag{5} $$

We shall call R = CT the causal operator of the whole S. It indicates
how a primary cause is transformed into the first derived one in the
internal, causal process of the whole S. The internal, causal process occurring
in S can be thought of as a succession of such elementary causal acts.

If the first derived cause equals the primary cause, we say that the
whole S is in a state of equilibrium:

$$ \bar{x} = x \quad \text{(a state of equilibrium)}. $$

If this is not the case, S is in a state of change:

$$ \bar{x} \neq x \quad \text{(a state of change)}. $$
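
The elementary causal act lends itself to a direct computational reading. The following Python sketch assumes that each element operator is given as an ordinary function on its own slice of the total input and that the coupling matrix is a list of rows; all names and the two-element example are hypothetical illustrations of ours, not part of the text.

```python
def mat_vec(C, y):
    """Total coupling: the derived cause C y for a given total effect y."""
    return [sum(c * v for c, v in zip(row, y)) for row in C]


def total_action(operators, input_sizes, x):
    """Total action operator T: apply each T_r to its own part x_r of x."""
    y, start = [], 0
    for T_r, m_r in zip(operators, input_sizes):
        y.extend(T_r(x[start:start + m_r]))
        start += m_r
    return y


def causal_step(C, operators, input_sizes, x):
    """One elementary causal act: the first derived cause R(x) = C T(x)."""
    return mat_vec(C, total_action(operators, input_sizes, x))


def is_equilibrium(C, operators, input_sizes, x):
    """True if the first derived cause equals the primary cause."""
    return causal_step(C, operators, input_sizes, x) == x


# Hypothetical whole: two elements with one input and one output channel each.
# Each element passes its input on unchanged; each receives the other's output.
def identity(x_r):
    return list(x_r)


operators = [identity, identity]
input_sizes = [1, 1]
C = [[0, 1],
     [1, 0]]

print(is_equilibrium(C, operators, input_sizes, [1, 1]))  # True
print(causal_step(C, operators, input_sizes, [1, 0]))     # [0, 1], a state of change
```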


If the whole S is in a continuous state of change, which is never stopped 
for cybernetic reasons (that is, neglecting the energy consumption 
of the process which may stop it), we say that there is an internal dia- 
lectical contradiction within the whole S. The non-stop internal, causal 
process which is so induced within the whole S is called a self-generating
process. Thus a dialectical contradiction within a whole S always causes 
a self-generating process in this whole and, vice versa, a self-generating 
process is always due to an internal dialectical contradiction within 
the system in question. 

The internal dialectical contradiction, if it exists, is said to be between 
the elements of the whole S in question. Or, more exactly, it is said to 
be between the inputs and the outputs of the individual elements of S. 

To explain this terminology, let us consider a whole composed of
only two elements $S_1$ and $S_2$. If there is an internal dialectical contradiction
within the whole S, the two elements $S_1$ and $S_2$ can never be simultaneously
in equilibrium. For, if $S_1$ is in equilibrium so that $x_1 = \bar{x}_1$,
it follows from the condition $x \neq \bar{x}$ that there must be $x_2 \neq \bar{x}_2$. Thus
the element $S_2$ 'denies' the equilibrium of the element $S_1$ and, vice
versa, $S_1$ if it is in equilibrium denies the equilibrium of $S_2$. This mutual
denial represents a dialectical contradiction. If this term were not
used, we would have to invent a new term expressing the same thing.
So, for instance, the English cybernetician W. Ross Ashby introduced
the term 'power of veto' for dialectical contradiction as a driving force
in the self-generating development of a cybernetic system¹⁵. However,
in dialectical materialism no new term is needed.
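
A toy case may make the mutual denial tangible. The Python sketch below is entirely an assumption of ours: two elements with binary inputs, where $S_1$ passes its input on unchanged, $S_2$ inverts its input, and each element receives the other's output. Under these assumptions no total input state is an equilibrium, so the process never stops.

```python
from itertools import product


def causal_operator(x):
    """R(x) for the hypothetical two-element whole with binary states."""
    x_1, x_2 = x
    y_1 = x_1        # T_1: S_1 passes its input on unchanged
    y_2 = 1 - x_2    # T_2: S_2 inverts its input
    return (y_2, y_1)  # coupling: S_1 receives y_2, S_2 receives y_1


def trajectory(x, steps):
    """The primary cause followed by its successive derived causes."""
    states = [x]
    for _ in range(steps):
        x = causal_operator(x)
        states.append(x)
    return states


# No state is an equilibrium: the derived cause always differs from the cause.
print(all(causal_operator(x) != x for x in product((0, 1), repeat=2)))  # True

# The process cycles through all four states without ever stopping.
print(trajectory((0, 0), 4))  # [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

In every step at least one of the two elements finds its input changed, which is the situation the text describes as one element denying the equilibrium of the other.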

The course of a self-generating process is determined by the causal
operator R of the system or, in Marxist terminology, by the mode of
action of the system. This in turn is determined by the matrices C and
T: the coupling matrix C expresses the structure of the system, while the
action operator T indicates the modes of action of the elements.

Causality, expressed by the causal operator, or the mode of action 
R of the whole S, was defined above, just as it was defined originally 
for a material cybernetic system (cf. p. 94—96): without an explicit 
introduction of time. This is a characteristic trait of cybernetics as 
a method of dialectical materialism: in cybernetics, and in dialectical 
materialism in general, causality is the primary thing. 

Note. Instead of using the causal operator R = CT we could as well 
have used the causal operator P = TC in our analysis of self-generating 
processes. Then we would only study the output process y(t) instead 
of the input process x(t). For convenience, we shall use P instead of R 
in our analysis of the production process in Vol. II (because y has fewer 
components than x, and is thus easier to handle in computations). 


3 / The Inner Law of Motion of a Whole 


We shall now introduce time explicitly into the mode of action of a 
whole S. The new form of the mode of action so obtained is called 
the Inner Law of Motion of the whole. The term was introduced by 
Marx¹⁶. 

Let us consider the effect of a primary cause x appearing in the whole 
S at the moment t. If the primary effect y is a sudden act, it appears 
at a definite moment t+θ of time, not before the cause x: θ > 0. We 
can in this case write the connection between cause and effect as follows: 


(6)   y(t+\theta) = T(x(t)), \quad \theta > 0   (sudden effect). 


If the effect of x is gradual, it is distributed over some interval 
(t, t+θ) of time, and we can write: 


(7)   y(t+\tau) = T(x(t), \tau) \neq 0 \quad for 0 < \tau < \theta   (gradual effect). 


15. W. Ross Ashby, Introduction to Cybernetics, London 1956, p. 100—101. 
16. Karl Marx, Capital, Vol. I, Moscow 1954, p. 10. 


To indicate the distribution of the effect y over the interval (t, t+θ) 
we have written the action operator T to depend explicitly on the time 
τ passed since the appearance of the cause x. We call θ the reaction 
time of the whole S. It will be assumed constant in our discussion. 

Assuming that the primary effect y(t+θ), or y(t+τ), is immediately, 
at the moment t+θ or t+τ, effective as a first derived cause x(t+θ), 
or x(t+τ), without any loss of time in the channels of coupling, we have: 


(I)   x(t+\theta) = C\,T(x(t)) = R(x(t))   (sudden effect), 
      x(t+\tau) = C\,T(x(t), \tau) = R(x(t), \tau) \neq 0 \quad for 0 < \tau < \theta   (gradual effect). 


This is the first form of the Inner Law of Motion of the whole S. Of course 
we could have introduced some further parameters to indicate the 
times spent by the primary effects y_s in the channels of coupling before 
they are transformed into the first derived causes x_r. This, however, 
would be an irrelevant complication from the point of view of our 
present study. 

A second form of the Inner Law of Motion is obtained by writing 
the above formulae (I) for the first derived cause x(t) instead of x(t+θ) 
or x(t+τ). Then we obtain: 


(II)  x(t) = C\,T(x(t-\theta)) = R(x(t-\theta))   (sudden effect), 

      x(t) = C \sum_{\tau=0}^{\theta} T(x(t-\tau), \tau) = \sum_{\tau=0}^{\theta} R(x(t-\tau), \tau)   (step-wise gradual effect), 

      x(t) = C \int_0^{\theta} T(x(t-\tau), \tau)\, d\tau = \int_0^{\theta} R(x(t-\tau), \tau)\, d\tau   (continuous gradual effect). 


We see that the Inner Law of Motion, whether given in the form (I) 
or (II), or for a sudden or a gradual effect, is a recursive formula. It 
represents the causal recursion by means of which the values of x for 
any moments t > z of time can be calculated from the values of x in 
the interval (z−θ, z). The values in this interval can be called the initial 
values of the process. We can choose the moment z and the initial 
values of the process arbitrarily. The segment of the internal process 
x(t) in any interval (z−θ, z) then determines uniquely the course of 
the internal process for all the future after the time z. 
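
The causal recursion for a sudden effect can be illustrated by a minimal sketch. It assumes a digital calendar in which θ is a whole number of time steps; the function name `iterate_sudden_effect` and the two-component operator `example_R` are hypothetical and serve only as an illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def iterate_sudden_effect(
    R: Callable[[Vector], Vector],
    initial_segment: Sequence[Vector],
    theta: int,
    steps: int,
) -> List[Vector]:
    """Apply x(t) = R(x(t - theta)) for t = z, z+1, ..., z+steps-1.

    initial_segment holds the initial values x(z-theta), ..., x(z-1).
    """
    if len(initial_segment) != theta:
        raise ValueError("the initial segment must contain exactly theta values")
    history = [list(v) for v in initial_segment]
    for _ in range(steps):
        # history[-theta] is the value theta steps before the one being computed
        history.append(R(history[-theta]))
    return history[theta:]


def example_R(x: Vector) -> Vector:
    # Hypothetical causal operator of a two-component whole (assumption).
    return [0.5 * x[1], 0.5 * x[0] + 1.0]


trajectory = iterate_sudden_effect(
    example_R, [[0.0, 0.0], [1.0, 1.0]], theta=2, steps=4
)
```

The initial segment alone fixes the whole later course: changing any of its θ values changes the trajectory, while nothing else has to be supplied.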


The causal operator R(x) is an m-component vector function of 
the m-component vector variable x. If R is analytic in an environment 
of a point x̄, we can expand the value R(x) in the Taylor series (cf. 
p. 65)¹⁷: 


(8)   R(x) = R(\bar{x}) + (\partial R/\partial x)_{x=\bar{x}}\,(x-\bar{x}) 
             + \frac{1}{2!}\,(x-\bar{x})'\,(\partial^2 R/\partial x^2)_{x=\bar{x}}\,(x-\bar{x}) + \dots 


which converges, and represents the function R(x), in some environment 
of the point x̄. This gives the possibility of relating, if the necessary 
conditions of analyticity and convergence are fulfilled, any two solutions 
x(t) and x̄(t) of the Inner Law of Motion, (II), by means of a power 
series in x(t−θ) − x̄(t−θ), or x(t−τ) − x̄(t−τ), respectively. We 
get for sudden effect: 


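(9)   x(t) = \bar{x}(t) + (\partial R/\partial x(t-\theta))_{x=\bar{x}}\,[x(t-\theta)-\bar{x}(t-\theta)] 
             + \frac{1}{2!}\,[x(t-\theta)-\bar{x}(t-\theta)]'\,(\partial^2 R/\partial x^2(t-\theta))_{x=\bar{x}}\,[x(t-\theta)-\bar{x}(t-\theta)] + \dots 


Similarly we get for step-wise gradual effect: 


(10)  x(t) = \bar{x}(t) + \sum_{\tau=0}^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,[x(t-\tau)-\bar{x}(t-\tau)] 
             + \frac{1}{2!} \sum_{\tau=0}^{\theta} [x(t-\tau)-\bar{x}(t-\tau)]'\,(\partial^2 R/\partial x^2(t-\tau))_{x=\bar{x}}\,[x(t-\tau)-\bar{x}(t-\tau)] + \dots 


And for continuous gradual effect: 


(11)  x(t) = \bar{x}(t) + \int_0^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,[x(t-\tau)-\bar{x}(t-\tau)]\, d\tau 
             + \frac{1}{2!} \int_0^{\theta} [x(t-\tau)-\bar{x}(t-\tau)]'\,(\partial^2 R/\partial x^2(t-\tau))_{x=\bar{x}}\,[x(t-\tau)-\bar{x}(t-\tau)]\, d\tau + \dots 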


17. ∂²R/∂x² is a tensor of third degree, a ”three-dimensional matrix”. 


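When written for the difference process 

      \Delta x(t) = x(t) - \bar{x}(t) 

this reads: 


(12)  \Delta x(t) = (\partial R/\partial x(t-\theta))_{x=\bar{x}}\,\Delta x(t-\theta) 
                    + \frac{1}{2!}\,\Delta x'(t-\theta)\,(\partial^2 R/\partial x^2(t-\theta))_{x=\bar{x}}\,\Delta x(t-\theta) + \dots   (sudden effect), 

      \Delta x(t) = \sum_{\tau=0}^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,\Delta x(t-\tau) 
                    + \frac{1}{2!} \sum_{\tau=0}^{\theta} \Delta x'(t-\tau)\,(\partial^2 R/\partial x^2(t-\tau))_{x=\bar{x}}\,\Delta x(t-\tau) + \dots   (step-wise gradual effect), 

      \Delta x(t) = \int_0^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,\Delta x(t-\tau)\, d\tau 
                    + \frac{1}{2!} \int_0^{\theta} \Delta x'(t-\tau)\,(\partial^2 R/\partial x^2(t-\tau))_{x=\bar{x}}\,\Delta x(t-\tau)\, d\tau + \dots   (continuous gradual effect). 
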

In the domain where the Taylor series of R converges, and represents 
the function R(x), one can calculate, by means of the equations (12), 
all the later values of the difference process Δx from its initial values 
in an arbitrarily chosen interval (z−θ, z). 

In particular, for a differential process 

      \delta x(t) = x(t) - \bar{x}(t), 

where δx(t) is for all t so small that the magnitudes of the second order 
in δx(t) can be ignored, we get: 


(13)  \delta x(t) = (\partial R/\partial x(t-\theta))_{x=\bar{x}}\,\delta x(t-\theta)   (sudden effect), 

      \delta x(t) = \sum_{\tau=0}^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,\delta x(t-\tau)   (step-wise gradual effect), 

      \delta x(t) = \int_0^{\theta} (\partial R/\partial x(t-\tau))_{x=\bar{x}}\,\delta x(t-\tau)\, d\tau   (continuous gradual effect). 
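
The first-order relation for sudden effect can be checked numerically. The following is only a sketch: the operator `example_R` is a hypothetical two-component causal operator, and the matrix ∂R/∂x is approximated by central finite differences rather than derived analytically.

```python
import math
from typing import Callable, List

Vector = List[float]


def numerical_jacobian(R: Callable[[Vector], Vector], x_bar: Vector,
                       h: float = 1e-6) -> List[Vector]:
    """Central-difference approximation of the matrix dR/dx at x = x_bar."""
    rows = len(R(x_bar))
    J = [[0.0] * len(x_bar) for _ in range(rows)]
    for j in range(len(x_bar)):
        plus, minus = list(x_bar), list(x_bar)
        plus[j] += h
        minus[j] -= h
        r_plus, r_minus = R(plus), R(minus)
        for i in range(rows):
            J[i][j] = (r_plus[i] - r_minus[i]) / (2.0 * h)
    return J


def example_R(x: Vector) -> Vector:
    # Hypothetical non-linear causal operator (assumption for illustration).
    return [math.tanh(x[1]), 0.5 * x[0] ** 2]


x_bar = [1.0, 0.5]              # value of the reference process at t - theta
dx = [1e-3, -2e-3]              # small disturbance delta x(t - theta)
J = numerical_jacobian(example_R, x_bar)

# First-order prediction of delta x(t) and the exact difference R(x) - R(x_bar).
predicted = [sum(J[i][j] * dx[j] for j in range(2)) for i in range(2)]
shifted = [x_bar[j] + dx[j] for j in range(2)]
exact = [a - b for a, b in zip(example_R(shifted), example_R(x_bar))]
errors = [abs(p - e) for p, e in zip(predicted, exact)]
```

The remaining error is of the second order in the disturbance, which is what the neglected terms of the Taylor series lead one to expect.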


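Let us study the matrix ∂R/∂x appearing here. Since R = CT, and 
since C is composed of the coupling matrices C^{rs}, and since T is a 
quasi-diagonal matrix composed of the individual action operators T_s, 
we can represent R in the following form: 


      \begin{pmatrix} R_{11} & \cdots & R_{1N} \\ \vdots & & \vdots \\ R_{N1} & \cdots & R_{NN} \end{pmatrix} 
      = \begin{pmatrix} C^{11} & \cdots & C^{1N} \\ \vdots & & \vdots \\ C^{N1} & \cdots & C^{NN} \end{pmatrix} 
        \begin{pmatrix} T_1 & & 0 \\ & \ddots & \\ 0 & & T_N \end{pmatrix} 


Here each sub-matrix R_{rs} = C^{rs} T_s has m_r rows and one column: 


(14)  R_{rs} = \begin{pmatrix} c^{rs}_{11} & \cdots & c^{rs}_{1 n_s} \\ \vdots & & \vdots \\ c^{rs}_{m_r 1} & \cdots & c^{rs}_{m_r n_s} \end{pmatrix} 
               \begin{pmatrix} T_{s1} \\ \vdots \\ T_{s n_s} \end{pmatrix} 
             = \begin{pmatrix} (R_{rs})_1 \\ \vdots \\ (R_{rs})_{m_r} \end{pmatrix} 


Each component T_{sk} of the action operator T_s of the element S_s is a 
function of all the components x_{s1}, x_{s2}, .., x_{s m_s} of the input vector x_s: 


      T_{sk} = T_{sk}(x_{s1}, x_{s2}, \dots, x_{s m_s}). 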


Accordingly, each component (R_{rs})_i of the sub-matrix R_{rs} has a 
derivative with respect to each component x_{sh} of x_s. This derivative 
can be written: 


(15)  \frac{\partial (R_{rs})_i}{\partial x_{sh}} 
        = \frac{\partial}{\partial x_{sh}} \sum_{k=1}^{n_s} c^{rs}_{ik} T_{sk} 
        = \sum_{k=1}^{n_s} c^{rs}_{ik} \frac{\partial T_{sk}}{\partial x_{sh}} 


The derivative of the vector function R(x) with respect to the vector 
x is the matrix composed of these derivatives: 


(16)  \frac{\partial R}{\partial x} = \left( \frac{\partial (R_{rs})_i}{\partial x_{sh}} \right)_{ri,sh} = m \times m matrix. 


As indicated above, the rows of the matrix ∂R/∂x are shown by the 
double index (ri), and the columns by the double index (sh). The matrix 
∂R/∂x has, obviously, m = m_1 + m_2 + ... + m_N rows and equally many 
columns. 

When substituting the matrix elements of ∂R/∂x into the formula 
(13) we get the corresponding equations in a component form: 


(17)  \delta x_{ri}(t) = \sum_{s,h} \left( \frac{\partial (R_{rs})_i}{\partial x_{sh}(t-\theta)} \right)_{x=\bar{x}} \delta x_{sh}(t-\theta)   (sudden effect), 


etc. 
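
The block structure of ∂R/∂x = C ∂T/∂x can be made concrete with a small sketch. The whole below is hypothetical: element 1 has two inputs and one output, element 2 has one input and two outputs, and all numerical values are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def block_diag(blocks: List[Matrix]) -> Matrix:
    """Quasi-diagonal matrix built from the individual blocks dT_s/dx_s."""
    rows = sum(len(b) for b in blocks)
    cols = sum(len(b[0]) for b in blocks)
    out = [[0.0] * cols for _ in range(rows)]
    r0 = c0 = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[r0 + i][c0 + j] = value
        r0 += len(b)
        c0 += len(b[0])
    return out


def matmul(A: Matrix, B: Matrix) -> Matrix:
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


# dT_1/dx_1 is n_1 x m_1 = 1 x 2, dT_2/dx_2 is n_2 x m_2 = 2 x 1 (assumed values).
dT = block_diag([[[2.0, 3.0]], [[4.0], [5.0]]])

# Coupling matrix C: rows are the inputs (x_11, x_12, x_21), columns the
# outputs (y_11, y_21, y_22). The outputs of element 2 feed element 1 and
# the output of element 1 feeds element 2 (assumed structure).
C = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]

dR = matmul(C, dT)   # m x m matrix dR/dx with m = m_1 + m_2 = 3
```

Each entry of `dR` is the sum over k of a coupling coefficient times a derivative of T_{sk}, as in formula (15); the diagonal blocks vanish here because neither element is coupled to itself.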


2 § Cybernetic Systematics of 
Self-Generating Processes 


1 / The Study of Internal Contradictions ’in the Small’: 
Cybernetic Categories of Contradictions 


Let x̄(t) be one of the solutions of the Inner Law of Motion, (I) or 
(II), of the whole S. It then represents one possible internal process 
occurring in this whole. 
Denoting the interval (z−θ, z) in the calendar K of the whole S 
by K^z_{z−θ}, 

      K^z_{z-\theta} = \{ t;\ z-\theta < t < z \} \subset K, 

the initial values of the process x̄(t), when taken in the interval 
K^z_{z−θ}, are represented by the time segment 

      x̄|K^z_{z−θ}   (the initial values of x̄ for the moment z). 

The initial values x̄|K^z_{z−θ} for the moment z then determine uniquely 
all the later development of the process x̄(t) for t ≥ z. 

If x(t) is another solution of the Inner Law of Motion of S, the initial 
values x|K^z_{z−θ} for the moment z determine uniquely all the later 
course of the process x(t) for t ≥ z. In particular, the initial values 
x|K^z_{z−θ} and x̄|K^z_{z−θ} together determine uniquely the course of 
the difference process Δx(t) = x(t) − x̄(t) for t ≥ z. 

If the necessary conditions of analyticity and convergence are 
fulfilled¹⁸, the later values of the difference process Δx(t) can be 
calculated by means of the power series in the equations (12), p. 150, 
when the initial values x̄|K^z_{z−θ} and x|K^z_{z−θ} are given. If the 
difference (x − x̄)|K^z_{z−θ} of the initial values is small enough, the 
values of the difference process Δx(t) can be calculated, for some 
sufficiently small interval K_z^{z+ε} of time, from the formula 


18. Let it be here emphasized, once and for all, that a corresponding discussion 
of the effects of a disturbance of the internal process of a whole could be performed 
without the assumptions of analyticity and convergence related to the causal 
operator R(x). However, the discussion would be mathematically complicated 
and uneasy. Therefore, we restrict ourselves to a model where the causal operator 
R(x) has the necessary qualifications to allow an analytic treatment. 


(18)  \Delta x(t) = (\partial R/\partial x(t-\theta))_{t=z}\,\Delta x(t-\theta) \quad for t \in K_z^{z+\varepsilon}   (sudden effect), 

      \Delta x(t) = \sum_{\tau=0}^{\theta} (\partial R/\partial x(t-\tau))_{t=z}\,\Delta x(t-\tau) \quad for t \in K_z^{z+\varepsilon}   (step-wise gradual effect), 

      \Delta x(t) = \int_0^{\theta} (\partial R/\partial x(t-\tau))_{t=z}\,\Delta x(t-\tau)\, d\tau \quad for t \in K_z^{z+\varepsilon}   (continuous gradual effect). 


Here the derivatives, as indicated, are taken at the moment z of time. 
Let us now introduce a vector function f_z defined by 


(19)  f_z(t) = (\partial R/\partial x(t-\theta))_{t=z}\, f_z(t-\theta) \quad for all t \ge z   (sudden effect), 

      f_z(t) = \sum_{\tau=0}^{\theta} (\partial R/\partial x(t-\tau))_{t=z}\, f_z(t-\tau) \quad for all t \ge z   (step-wise gradual effect), 

      f_z(t) = \int_0^{\theta} (\partial R/\partial x(t-\tau))_{t=z}\, f_z(t-\tau)\, d\tau \quad for all t \ge z   (continuous gradual effect). 

We have 

(20)  f_z(t) = \Delta x(t) \quad for t \in K_z^{z+\varepsilon}, 


while for the times t ≥ z+ε the values of the function f_z may differ from 
the values of the difference process. We call f_z the trend function of 
the difference process Δx(t) ”in the small”, that is, in a case where 
the initial values (x − x̄)|K^z_{z−θ} are small quantities of the first 
order so that their second powers can be ignored. 

By means of the trend function f_z we can study the response of the 
system S to small disturbances of its internal process x̄(t). By a small 
disturbance of the process x̄(t) we mean a displacement of its initial 
values, x̄|K^z_{z−θ} → (x̄ + Δx)|K^z_{z−θ}, such that Δx = x(t) − x̄(t), 
where x(t) is also a solution of the Inner Law of Motion, and the values 
Δx|K^z_{z−θ} are small quantities of the first order. 

We shall now study the response of the whole S to such a small 
disturbance of its internal process x̄(t), performing the solution in more 
detail in the case of sudden effect. 


1. Solution for Sudden Effect. For convenience, we shall write in short 


      (\partial R/\partial x(t-\theta))_{t=z} = (\partial R/\partial x)_z = m \times m matrix. 


The defining equation of the trend function then is for sudden effect: 


(21)  f_z(t) = (\partial R/\partial x)_z\, f_z(t-\theta). 


The elements of the matrix ∂R/∂x being here independent of the 
time t (though they depend on the moment z), this is an ordinary 
difference equation for an m-component vector function. The substitution 
f_z(t) = wλ^{t/θ}, where w is a constant m-component vector and λ is a 
number, yields: 


      w \lambda^{t/\theta} = (\partial R/\partial x)_z\, w \lambda^{(t-\theta)/\theta}. 


Here the common factor λ^{(t−θ)/θ} cancels, and we are left with the 
eigen value equation 


(22)  (\partial R/\partial x)_z\, w = \lambda w, \quad or \quad [ (\partial R/\partial x)_z - \lambda I ]\, w = 0. 


Here I is the m×m unit matrix. 
Solutions w ≠ 0 exist only when the determinant of the characteristic 
matrix vanishes: 


(23)  \det[ (\partial R/\partial x)_z - \lambda I ] = 0. 


This is the characteristic equation (cf. p. 52) for the matrix (∂R/∂x)_z. 
The left-side member is a polynomial of the m-th degree in the unknown 
λ. Accordingly (cf. p. 79), we have m roots: 

      \lambda_1, \lambda_2, \dots, \lambda_m. 

Each of them, of course, is a function of the parameter z: 

      \lambda_j = \lambda_j(z); \quad j = 1, 2, \dots, m. 

For each non-multiple root λ_j we have exactly one solution (λ_j, w_j), 
apart from an arbitrary constant factor in w_j, of the eigen value equation. 
The eigen vector w_j of course depends on z too: 

      w_j = w_j(z); \quad j = 1, 2, \dots, m. 

We get, from each such solution (λ_j, w_j) of the eigen value equation, 
a uniquely determined solution f_z(t)_j = w_j λ_j^{t/θ} of the equation (21) 
(up to a constant factor). Assuming that there are no multiple roots, 
the general solution of the difference equation (21) thus is: 

      f_z(t) = \sum_{j=1}^{m} w_j \lambda_j^{t/\theta}. 



Each root λ_j is either a real or a complex number, and thus can always 
be written (cf. the Euler formula, p. 64) as 

      \lambda_j = r_j e^{i\omega_j} = r_j (\cos\omega_j + i \sin\omega_j). 

Here r_j is a non-negative real number, viz. the modulus of λ_j: |λ_j| = r_j. 
The real number ω_j is the argument of λ_j: arg λ_j = ω_j. 

The root λ_j is real for ω_j = 0, ±π. The eigen vector w_j is of course 
then real too. For ω_j ≠ 0, ±π we have a complex root λ_j. In this case, 
obviously, the eigen vector w_j is complex too. Let us write: 

      w_j = a_j + i b_j, 

where a_j and b_j are real vectors. 
For each complex root λ_j the conjugate complex number is also 
one of the m roots (cf. p. 79). Let us write λ_k = λ_j^*. The sum of the 
j-th and the k-th terms in the solution f_z(t) can then be made real by 
choosing w_k = w_j^*, as can readily be seen from the condition that 

      ( w_j \lambda_j^{t/\theta} + w_j^* \lambda_j^{* t/\theta} )^* = w_j^* \lambda_j^{* t/\theta} + w_j \lambda_j^{t/\theta}. 

Then we obtain: 

      w_j \lambda_j^{t/\theta} + w_j^* \lambda_j^{* t/\theta} 
        = 2 a_j r_j^{t/\theta} \cos(\omega_j t/\theta) - 2 b_j r_j^{t/\theta} \sin(\omega_j t/\theta) = real. 

In this way we get a real solution for f_z(t). Since sin u = cos(u − π/2), 
we can write this solution in the form 

      f_z(t) = \sum_j v_j r_j^{t/\theta} \cos(\omega_j t/\theta - \varphi_j). 

Here each v_j is a real m-component vector, and φ_j is either 0 or π/2. 
If there are multiple roots among the λ_j, we have only to add a factor 
t^{k_j} to get the real solution for f_z(t): in this case we have 


(24)  f_z(t) = \sum_j v_j\, t^{k_j}\, r_j^{t/\theta} \cos(\omega_j t/\theta - \varphi_j). 



Here k_j is zero for every single root λ_j. For an n-fold root 
λ_{j_1} = λ_{j_2} = ... = λ_{j_n} we have to put k_{j_ν} = ν − 1. Thus we 
have the powers t^0, t^1, t^2, .., t^{n−1} in the corresponding terms of 
f_z(t). Since the formula comprises also the case in which all the roots 
are single roots (this corresponds to k_j = 0 for all j), this formula 
gives the general real solution of the equation (21). 
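
The eigen value solution of (21) can be compared with the direct recursion in a small numerical sketch. It is restricted, as an assumption, to a 2×2 matrix (∂R/∂x)_z with distinct roots and a non-zero upper off-diagonal element, and it samples the trend function only at moments that are a whole number k of reaction times apart; the function names are hypothetical.

```python
import cmath
from typing import List, Tuple

Matrix2 = Tuple[Tuple[float, float], Tuple[float, float]]


def eigen_pairs(A: Matrix2):
    """Eigen values and eigen vectors of a 2x2 matrix (assumes A[0][1] != 0)."""
    (a, b), (c, d) = A
    if b == 0:
        raise ValueError("this sketch assumes a non-zero element A[0][1]")
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    if root == 0:
        raise ValueError("this sketch assumes distinct roots")
    # For every root lam, the vector (b, lam - a) satisfies A w = lam w.
    return [(lam, (complex(b), lam - a))
            for lam in ((trace + root) / 2, (trace - root) / 2)]


def trend_closed_form(A: Matrix2, f0: List[float], k: int) -> List[float]:
    """f after k reaction times, written as c1*w1*lam1**k + c2*w2*lam2**k."""
    (l1, w1), (l2, w2) = eigen_pairs(A)
    det_w = w1[0] * w2[1] - w2[0] * w1[1]
    c1 = (f0[0] * w2[1] - w2[0] * f0[1]) / det_w   # Cramer's rule
    c2 = (w1[0] * f0[1] - f0[0] * w1[1]) / det_w
    return [(c1 * w1[i] * l1 ** k + c2 * w2[i] * l2 ** k).real for i in range(2)]


def trend_by_recursion(A: Matrix2, f0: List[float], k: int) -> List[float]:
    f = list(f0)
    for _ in range(k):
        f = [A[0][0] * f[0] + A[0][1] * f[1], A[1][0] * f[0] + A[1][1] * f[1]]
    return f


# Assumed example: derivatives of opposite sign give a conjugate pair of roots.
A = ((0.0, -0.5), (0.5, 0.0))
f0 = [1.0, 0.0]
```

Although both roots are imaginary here, the two conjugate terms add up to a real trend function, which is the content of the choice w_k = w_j^* made above.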


2. Systematics of the Internal Process. We can now discuss the response 
of the whole S to a small disturbance of its internal process x̄(t) during 
an interval K^z_{z−θ}. We know now that this response is determined 
by the properties of the eigen values of the matrix (∂R/∂x)_z. We shall 
distinguish four mutually exclusive cases related with the information 
contained in the formula (24) as follows: 


(1) all λ_j = 0: 
    no information on f_z(t); 


(2) all |λ_j| = r_j < 1, and at least one λ_j ≠ 0: 
    then f_z(t) → 0 when t → ∞; 


(3) only single roots for which |λ_j| = 1, for others |λ_j| < 1: 
    then f_z(t) → \sum_j v_j \cos(\omega_j t/\theta - \varphi_j) when t → ∞; 


(4) at least one at least double root for which |λ_j| = 1, 
    or at least one root for which |λ_j| > 1: 
    then f_z(t) → ∞ when t → ∞. 
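
The four cases can be written down as a small decision procedure. The sketch below follows the rule exactly as stated in the text; the function name `classify_case` and the numerical tolerance are assumptions, and the eigen values are supposed to be supplied by the reader.

```python
from typing import Sequence


def classify_case(eigenvalues: Sequence[complex], tol: float = 1e-9) -> int:
    """Return the number (1-4) of the case to which the eigen values belong."""
    moduli = [abs(lam) for lam in eigenvalues]
    if all(r <= tol for r in moduli):
        return 1                      # all roots zero: no information
    if any(r > 1.0 + tol for r in moduli):
        return 4                      # at least one root outside the unit circle
    on_circle = [lam for lam, r in zip(eigenvalues, moduli)
                 if abs(r - 1.0) <= tol]
    if not on_circle:
        return 2                      # all moduli below one, not all zero
    for i, first in enumerate(on_circle):
        for second in on_circle[i + 1:]:
            if abs(first - second) <= tol:
                return 4              # multiple root of modulus one
    return 3                          # only single roots of modulus one


examples = {
    "no information": classify_case([0, 0]),
    "decaying": classify_case([0.5j, -0.5j]),
    "persisting oscillation": classify_case([1j, -1j]),
    "growing": classify_case([1.2, 0.1]),
}
```

The order of the tests matters only for readability: the four conditions are mutually exclusive, so each list of eigen values falls into exactly one case.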


Let us study first case (1). In this case we have destroyed our possibility 
to get information on the trend function f_z by making the substitution 
f_z(t) = 0 from the very beginning. Thus the ”result” f_z(t) = 0 obtained 
from the formula (24) in this case does not tell anything but our original 
substitution. 

We must study, in case (1), directly the properties of the matrix 

      (\partial R/\partial x)_z = C\, (\partial T/\partial x)_z. 

Remembering the quasi-diagonal form of the matrix T we can represent 
the matrix (∂R/∂x)_z as a composition of the m_r × m_s matrices 

      [ (\partial R/\partial x)_z ]_{rs} = C^{rs}\, (\partial T_s/\partial x_s)_z. 

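Here C^{rs} is the m_r × n_s coupling matrix from the element S_s to the 
element S_r, and (∂T_s/∂x_s)_z is the n_s × m_s matrix composed of the 
derivatives of T_{sk} with respect to the components x_{sh}(z−θ). 

Putting 

      [ (\partial R/\partial x)_z ]_{rs} = 0 \quad for all r \ge s, 

the matrix (∂R/∂x)_z will have the following form: 

      (\partial R/\partial x)_z = 
      \begin{pmatrix} 
        0 & [(\partial R/\partial x)_z]_{12} & \cdots & [(\partial R/\partial x)_z]_{1N} \\ 
        0 & 0 & \cdots & [(\partial R/\partial x)_z]_{2N} \\ 
        \vdots & & \ddots & \vdots \\ 
        0 & 0 & \cdots & 0 
      \end{pmatrix} 

In words: there may be non-vanishing matrix elements only above 
the main diagonal of the matrix (∂R/∂x)_z. It follows that the 
characteristic matrix has the form 

      (\partial R/\partial x)_z - \lambda I = 
      \begin{pmatrix} 
        -\lambda & [(\partial R/\partial x)_z]_{12} & \cdots & [(\partial R/\partial x)_z]_{1N} \\ 
        0 & -\lambda & \cdots & [(\partial R/\partial x)_z]_{2N} \\ 
        \vdots & & \ddots & \vdots \\ 
        0 & 0 & \cdots & -\lambda 
      \end{pmatrix} 

Hence, 

      \det[ (\partial R/\partial x)_z - \lambda I ] = (-\lambda)^m. 

Accordingly, the condition that this determinant is zero is in this case 
equivalent to the condition that λ^m = 0 or, what is the same, that all 
the eigen values λ_j are zero. 
Now the equations 

(26)  [ (\partial R/\partial x)_z ]_{rs} = 0 \quad for all r \ge s 

are valid, 

(1.1) if C^{rs} = 0 for all r ≥ s, or 
(1.2) if some of these C^{rs} are different from zero there is, 
      however, 

      C^{rs}\, (\partial T_s/\partial x_s(t-\theta))_z = 0 \quad for all r \ge s. 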


In case (1.1) there are no circuits of feedback within the whole S. 
We can see this as follows. If there is a circuit of n elements, say, 
S_{r_1} → S_{r_2} → ... → S_{r_n} → S_{r_1}, then all the coupling matrices 
C^{r_1 r_2}, C^{r_2 r_3}, ..., C^{r_n r_1} must be different from zero. The 
condition for this is r_1 < r_2 < r_3 < ... < r_n < r_1, which is a logical 
contradiction. Accordingly, the condition C^{rs} = 0 for all r ≥ s is 
incompatible with the existence of a circuit of feedback within the whole 
S. On the other hand, if there is no circuit of feedback in S, we can 
always order the elements S_1, .., S_N so that there will be C^{rs} = 0 
for all r ≥ s. Thus case (1.1) is equivalent with the case of no circuits 
of feedback in the whole S. 
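
The equivalence between the absence of feedback circuits and the possibility of such an ordering can be sketched as a topological sort. The function name `order_without_feedback` and the encoding of the couplings as index pairs are assumptions made for illustration; elements are numbered from 0.

```python
from typing import Iterable, List, Optional, Tuple


def order_without_feedback(
    n: int, couplings: Iterable[Tuple[int, int]]
) -> Optional[List[int]]:
    """Try to order n elements so that every non-zero C^{rs} has r before s.

    couplings holds the pairs (r, s) for which C^{rs} is different from zero.
    Returns the ordering, or None if a circuit of feedback makes it impossible.
    """
    successors: List[List[int]] = [[] for _ in range(n)]
    indegree = [0] * n
    for r, s in set(couplings):
        successors[r].append(s)
        indegree[s] += 1
    ready = [e for e in range(n) if indegree[e] == 0]
    order: List[int] = []
    while ready:
        element = ready.pop()
        order.append(element)
        for nxt in successors[element]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Elements lying on a circuit never reach indegree zero.
    return order if len(order) == n else None


chain = order_without_feedback(3, [(0, 1), (1, 2)])
loop = order_without_feedback(2, [(0, 1), (1, 0)])
```

If the elements are renumbered according to the returned ordering, every non-zero coupling matrix lies above the main diagonal, which is condition (1.1); a coupling of an element to itself counts as a circuit in this sketch.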

In such a case there can be no continued internal process x(t) within 
the whole S. The internal process x(t) represents, in this case, a mechanical 
reaction of the system to an external input, and the process stops after 
the external output, the response, has been given. Since a mechanical 
reaction is not a continued, self-generating process, it does not belong to 
dialectical processes, according to the definition of the latter (see p. 146). 
Neither is there any dialectical contradiction within the whole S, capable 
only of mechanical reactions. 

In case (1.2) some circuit(s) of feedback exist, and thus the whole 
S is capable of continued, self-generating internal process x(t). However, 
it follows from the condition that the self-generation of the process 
x(t) is at the moment t = z of a passive kind. This is seen as follows. 
From condition (1.2) we get: 


(27)  \delta x_{r|s}(z) = C^{rs}\, (\partial T_s/\partial x_s)_z\, \delta x_s(z-\theta) = 0 \quad for all r \ge s. 

Accordingly, the part of the internal process x(t) which is due to a 
feedback does not react in any way at the moment t = z to a small 
disturbance at z−θ. We say that in this case the internal process x(t) 
is passively self-generating at the moment z, and that there exists a 
latent dialectical contradiction in the whole S at the moment z. 

The existence of case (1.2) shows that from the vanishing of all the 
eigen values it does not follow that there is no circuit of feedback, 
and no self-generation, in the system S. On the other hand, if there 
is no circuit of feedback, then the eigen values are necessarily zero, 
and we have case (1). 

Cases (2), (3), and (4) on p. 156 thus all correspond to cases where 
at least one circuit of feedback exists in the whole S. In all these cases 
we call the internal process x(t) of the system S actively self-generating 
at the moment t = z. We also say that there is, in all these cases, an 
active dialectical contradiction within the whole S at the moment t = z. 

In case (2) we say that the goal or the direction x̄ is acceptable for 
the whole S at the moment t = z. This means the following. If the 
whole S performs the internal process x̄(t), and if there is a small 
disturbance of this process during the interval K^z_{z−θ}, the whole S has 
at the moment t = z the tendency to eliminate the effects of the 
disturbance, and go back to the process x̄(t). 

In case (3) we say that the goal x̄ is indifferent, or neutral, for the 
whole S at the moment t = z. This means the following. If S performs 
the internal process x̄(t), and if there is a small disturbance of this 
process during the interval K^z_{z−θ}, the system S has at the moment 
t = z no tendency to eliminate or to increase the effects which the 
disturbance has on its internal process. 

In case (4) we say that the goal x̄ is non-acceptable for the whole S 
at the moment t = z. This means: if S is performing the internal process 
x̄(t), and if there is a small disturbance of this process during the interval 
K^z_{z−θ}, the whole S has at the moment t = z the tendency to increase 
the effects of the disturbance so that its internal process will deviate 
from the process x̄(t). Accordingly, the internal process, or the action 
of S, is diverging from x̄ at the moment t = z. 

In the latter case it may happen that the whole S will find at some 
later moment an acceptable goal (either x̄, which has become acceptable 
meanwhile, or a new goal x ≠ x̄). On the other hand, it may happen 
that the goal x̄ is forced upon the system S by a larger whole of which 
S is a part. In such a case the internal process of the whole S cannot 
diverge from x̄(t) but this goal must be made acceptable by a structural 
change in S, that is, by a change C → C' of the coupling, so that the 
new eigen values of the matrix ∂R/∂x = C' ∂T/∂x will satisfy condition 
(2) with respect to the process x̄(t). If the goal x̄ is not acceptable to 
the whole S at the moment z, and if no alternative goal exists for S, 
we can say that there is an antagonistic contradiction within the system 
S at the moment t = z. Such a contradiction will in due time necessarily 
lead to a qualitative change C → C' in the system S. 

In other cases the dialectical contradiction is non-antagonistic. The 
internal process of the system S is in case (2) called a self-steering or 
ergodic, in case (3) a stationary, and in case (4) a cumulative, or 
anti-ergodic, process at the moment z. Thus we have arrived at the 
systematics of the internal process shown in Fig. 25. 


Fig. 24. A two-component feedback system 


Oskar Lange¹⁹ called the feedback couplings which make the matrix 
(∂R/∂x)_z satisfy condition (2) of self-steering the compensative feedback 
couplings. 

Exercise. Consider, by the above formalism, the simple system of 
Fig. 24. Let the coupling coefficients be c_12 = c_21 = 1. Show that the 
eigen values of (∂R/∂x)_z are given by 


      \lambda_{1,2} = \pm \sqrt{ (\partial T_1/\partial x_1)_z\, (\partial T_2/\partial x_2)_z }. 

These are real, if the derivatives have the same sign. In such a case the 
feedback is called positive. The eigen values are imaginary, if the 
derivatives have opposite signs. The feedback is then called negative. 
The condition of self-steering at the moment z is 


      | (\partial T_1/\partial x_1)_z\, (\partial T_2/\partial x_2)_z | < 1. 


Show that the condition of self-steering can in this case also be expressed 
as the condition of the negative-definiteness of the matrix 


19. Oskar Lange, Wholes and Parts, London 1965. It may be appropriate here to 
make a note of the differences the reader will find between the exposition given 
here in § 1 and § 2, on the one hand, and the exposition of these matters given 
by Lange. The innovations of my exposition, compared with Lange’s, can be 
listed as follows: 1) the distinction between the trend function (f_z or F_z) and the 
corresponding difference process Δx; this is important, because it indicates that 
2) the properties of the matrix (∂R/∂x)_z determine only the local properties of 
the process x̄(t) in a future neighbourhood K_z^{z+ε} of the moment z; this in turn 
3) leads to a systematics and definitions of self-generating processes (see Fig. 25) 
differing from those given by Lange; it also indicates that 4) the properties of the 
derivatives (∂^n R/∂x^n)_z, n = 1,2,.., taken at the point z, determine only the 
local properties of a ’large’ difference process Δx(t) (see § 2.2.), i.e. the properties 
in a future neighbourhood K_z^{z+ε} of z. One can also mention 5) the replacement 
of Lange’s formalism of quasi-diagonal matrices X and Y for the representation 
of internal processes, by a formalism using solely the vectors x and y; this allows 
6) a separation of the roles of the matrices C and ∂T/∂x as determinants of the 
local properties of the internal process; this in turn makes possible 7) the distinction 
between the non-existence of feedback couplings (case 1.1) and the non-existence 
of active self-generation or, in other words, the distinction between the mechanical 
reactions and the passively self-generating processes. Finally, 8) the coupling 
coefficients c may have here any real values, while in Lange’s exposition their 
values were restricted to 0 and 1. 


internal process x̄ 

   no active dialectical controversy at the moment z 
   (all eigen values zero at z): case (1) 
      mechanical reaction 
      passively self-generating at z (dialectical process) 

   active dialectical controversy at the moment z 
   (at least one eigen value non-zero at z): 
   actively self-generating at z (dialectical process) 
      accepting the goal x̄ at z: case (2), self-steering (ergodic) at z 
      neutral to the goal x̄ at z: case (3), stationary at z 
      not accepting the goal x̄ at z: case (4), anti-ergodic at z 
         acceptable goal possible: non-antagonistic controversy at z 
         no acceptable goals possible: antagonistic controversy at z 


Fig. 25. The systematics of the internal process 


3. Solution for Gradual Effect. The results do not change when we 
go to the cases of gradual effect. We shall here point only to some 
differences in comparison with the case of sudden effect. 

For a gradual effect we have a matrix ∂R/∂x for each value of the 
magnitude τ: 


      (\partial R(\tau)/\partial x)_z = (\partial R(\tau)/\partial x(t-\tau))_{t=z}. 


The defining equations of the trend function f_z then become: 


(28)  f_z(t) = \sum_{\tau=0}^{\theta} (\partial R(\tau)/\partial x)_z\, f_z(t-\tau)   (step-wise effect), 

      f_z(t) = \int_0^{\theta} (\partial R(\tau)/\partial x)_z\, f_z(t-\tau)\, d\tau   (continuous effect). 


The substitution f_z(t) = wλ^{t/θ} now leads to the characteristic 
equations 


(29)  \det\left[ \sum_{\tau=0}^{\theta} (\partial R(\tau)/\partial x)_z\, \lambda^{-\tau/\theta} - I \right] = 0   (step-wise effect), 

      \det\left[ \int_0^{\theta} (\partial R(\tau)/\partial x)_z\, \lambda^{-\tau/\theta}\, d\tau - I \right] = 0   (continuous effect). 
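
The step from the defining equation to the characteristic equation can be written out for the step-wise effect; the continuous case is obtained in the same way with the sum replaced by the integral. This is only a worked restatement of the substitution described above, under the assumption that λ ≠ 0.

```latex
% Step-wise gradual effect: substitute f_z(t) = w \lambda^{t/\theta}
% into the defining equation of the trend function.
\begin{align*}
w\,\lambda^{t/\theta}
  &= \sum_{\tau} \left(\frac{\partial R(\tau)}{\partial x}\right)_z
     w\,\lambda^{(t-\tau)/\theta} \\
0 &= \left[\, \sum_{\tau} \left(\frac{\partial R(\tau)}{\partial x}\right)_z
     \lambda^{-\tau/\theta} - I \right] w
  \qquad \text{(after dividing by } \lambda^{t/\theta} \neq 0 \text{)}
\end{align*}
% A solution w \neq 0 exists only if the determinant of the bracket vanishes.
```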
3 C. 


2 / The Study of Internal Contradictions ’In the Large’: 
Purposiveness and Ergodicity 


Let us now study a difference process Δx(t) = x(t) − x̄(t) ”in the large”. 
In other words, we shall again assume that x(t) and x̄(t) are two solutions 
of the Inner Law of Motion, (II) on p. 148, but now the initial segment 
(x − x̄)|K^z_{z−θ} is not composed of small quantities of the first 
degree. 

Assuming the necessary conditions of analyticity and convergence 
are fulfilled, we can again use the Taylor series (12) for the difference 
process. However, we must now take into consideration the higher 
terms too. Accordingly, to express the course of the difference process 
in some future neighbourhood K_z^{z+ε} of the time point t = z, we must 
now replace the equations (18) on p. 153 by the equations 


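(30)  \Delta x(t) = (\partial R/\partial x)_z\, \Delta x(t-\theta) 
                    + \frac{1}{2!}\, \Delta x(t-\theta)'\, (\partial^2 R/\partial x^2)_z\, \Delta x(t-\theta) + \dots 
                    \quad for t \in K_z^{z+\varepsilon}   (sudden effect), 

      \Delta x(t) = \sum_{\tau=0}^{\theta} (\partial R(\tau)/\partial x)_z\, \Delta x(t-\tau) 
                    + \frac{1}{2!} \sum_{\tau=0}^{\theta} \Delta x(t-\tau)'\, (\partial^2 R(\tau)/\partial x^2)_z\, \Delta x(t-\tau) + \dots 
                    \quad for t \in K_z^{z+\varepsilon}   (step-wise gradual effect), 

      \Delta x(t) = \int_0^{\theta} (\partial R(\tau)/\partial x)_z\, \Delta x(t-\tau)\, d\tau 
                    + \frac{1}{2!} \int_0^{\theta} \Delta x(t-\tau)'\, (\partial^2 R(\tau)/\partial x^2)_z\, \Delta x(t-\tau)\, d\tau + \dots 
                    \quad for t \in K_z^{z+\varepsilon}   (continuous gradual effect). 

We can now introduce the trend function ”in the large” defined by 

(31)  F_z(t) = (\partial R/\partial x)_z\, F_z(t-\theta) 
               + \frac{1}{2!}\, F_z'(t-\theta)\, (\partial^2 R/\partial x^2)_z\, F_z(t-\theta) + \dots 
               \quad for all t \ge z   (sudden effect), 

      F_z(t) = \sum_{\tau=0}^{\theta} (\partial R(\tau)/\partial x)_z\, F_z(t-\tau) 
               + \frac{1}{2!} \sum_{\tau=0}^{\theta} F_z'(t-\tau)\, (\partial^2 R(\tau)/\partial x^2)_z\, F_z(t-\tau) + \dots 
               \quad for all t \ge z   (step-wise gradual effect), 

      F_z(t) = \int_0^{\theta} (\partial R(\tau)/\partial x)_z\, F_z(t-\tau)\, d\tau 
               + \frac{1}{2!} \int_0^{\theta} F_z'(t-\tau)\, (\partial^2 R(\tau)/\partial x^2)_z\, F_z(t-\tau)\, d\tau + \dots 
               \quad for all t \ge z   (continuous gradual effect), 

and by the initial condition that 

(32)  F_z | K^z_{z-\theta} = \Delta x | K^z_{z-\theta}. 

Due to the equations (30) and (31) and to the initial condition (32), 
the functions F_z and Δx coincide in the nearest future after the moment 
z, viz. for t ∈ K_z^{z+ε}: 

      F_z(t) = \Delta x(t) \quad for t \in K_z^{z+\varepsilon}. 

However, due to the differences of the derivatives in the equations 
(12), p. 150, on the one hand, and the equations (31), on the other, viz. 

      (\partial R/\partial x)_t \neq (\partial R/\partial x)_z, \quad 
      (\partial^2 R/\partial x^2)_t \neq (\partial^2 R/\partial x^2)_z, \quad etc., 

the functions F_z and Δx may differ from one another for the values 
t > z+ε. 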

Since we have assumed the convergence of the series in (31), the function 
F_z can be approximated in successive approximations by the functions 
F_z^{(1)}, F_z^{(2)}, F_z^{(3)} etc., the function F_z^{(n)} being defined 
by the polynomial of the degree n including only the first n derivatives 
of R in the equations (31), and satisfying the respective initial condition. 

If all the values of the initial segment (x − x̄)|K^z_{z−θ} are small 
quantities of the first order, the function F_z is reduced to the function f_z: 


(33)  F_z(t) \to f_z(t) \quad when \Delta x | K^z_{z-\theta} \to 0. 


Thus the trend function F_z can be considered as an extension of the 
trend function f_z for non-infinitesimal, finite values of Δx|K^z_{z−θ}. 

We can use the function F_z to indicate the trend of the internal 
process x̄(t) in all cases in which this process is self-generating, whether 
actively or passively. Obviously, the behaviour of the trend function 
F_z depends 


1° on all the derivatives (∂^n R/∂x^n)_z (sudden effect), or 
(∂^n R(τ)/∂x^n)_z (gradual effect), for n = 1,2,3,..., and 
2° on the initial segment Δx|K^z_{z−θ}, representing the disturbance of 
the internal process. 
We have the following three possible cases as to the asymptotic 
behaviour of the function F_z. 


First, it may happen that F_z(t) tends to zero when t goes to infinity. 
We say then that the initial segment Δx|K^z_{z−θ} belongs to the domain 
of ergodicity of the process x̄(t) at the moment z. The whole S has then, 
at the moment t = z, a tendency to eliminate the effects of the 
disturbance Δx|K^z_{z−θ}. Obviously, it follows from the respective 
definitions that if the process x̄(t) is self-steering at the moment z, then 
it has a certain domain of ergodicity at this moment. 

Secondly, it may happen that F_z(t) approaches a constant, or an 
oscillating function of t, when t goes to infinity. We say then that the 
initial segment Δx|K^z_{z−θ} belongs to the domain of stationarity of 
the process x̄(t) at the moment z. In this case the whole S has, at the 
moment z, no tendency to eliminate or to increase the effects of the 
disturbance Δx|K^z_{z−θ}. Obviously, if the process x̄(t) is stationary 
at the moment z, it has a domain of stationarity at this moment, such 
that the zero disturbance Δx|K^z_{z−θ} = 0 belongs to it. If, on the other 
hand, the process x̄(t) is self-steering at the moment z, it may have 
a domain of stationarity at the boundary of the domain of ergodicity. 
In the latter case the domain of stationarity does not include the 
zero disturbance, which belongs to the domain of ergodicity in this 
case. 

Thirdly, F_z(t) may go to infinity when t increases sufficiently. Then 
we say that the initial segment Δx|K^z_{z−θ} belongs to the domain of 
anti-ergodicity of the process x̄(t) at the moment z. The whole S then 
has a tendency to increase the effects of the disturbance Δx|K^z_{z−θ}. 
If the process x̄(t) is cumulative at the moment z, it has at this moment 
a domain of anti-ergodicity which includes the zero disturbance. If 
the process x̄(t) is self-steering or stationary at the moment z, it may 
still have a domain of anti-ergodicity composed of large enough 
disturbances Δx|K^z_{z−θ}. 

Obviously any disturbance Δx|K^z_{z−θ} of a self-generating process 
x̄(t) belongs either to the domain of ergodicity, or to the domain of 
stationarity, or to the domain of anti-ergodicity of the process x̄(t) 
at every moment z of the calendar K of the system S. We are now 
interested only in the domains of ergodicity of a self-steering process, 
and shall have a glance at them in the following. 

The domain of ergodicity is obviously a characteristic of a self- 
generating process x(t) occurring in a whole S, and it is associated 
with a certain moment z of time in the calendar K of the system S. 
For different wholes S, or for different self-generating internal processes 
x(t) of the same whole S, or for different moments of time during the 
course of the same process x(t), there are different domains of ergodicity. 
Thus we can denote a definite domain of ergodicity by the symbol
D_{x,S}(z).

It follows from the definition of ergodicity that the domain D_{x,S}(z)
is a subset of the vector function space F(K^z_{z-θ}, R^m) composed of all
the possible disturbances Δx|K^z_{z-θ}:


(34)    D_{x,S}(z) \subset F(K^{z}_{z-\theta}, R^{m}) = \{ \Delta x | K^{z}_{z-\theta} \}.


Here R^m is the m-fold cartesian product of the set R of the real numbers
with itself, and R^m represents the range of values of the real,
m-component vectors Δx(t). The domain D_{x,S}(z) of a self-steering
process x contains the zero disturbance.

We need a measure of the largeness of the domain of ergodicity.
The construction of such a measure is simple if S is a digital
system. Then the time interval K^z_{z-θ} is a sequence of a finite number
of moments of time represented by successive integers, θ being an
integer too:


        K^{z}_{z-\theta} = \{ z-\nu ;\ \nu = 1, 2, \ldots, \theta \}    (digital system).


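An arbitrary disturbance Δx|K^z_{z-θ} is then represented by an m×θ
matrix A:


        A = \begin{pmatrix}
              A_{11} & A_{12} & \cdots & A_{1\theta} \\
              \vdots & \vdots &        & \vdots      \\
              A_{m1} & A_{m2} & \cdots & A_{m\theta}
            \end{pmatrix}


Here A_{iν} = Δx_i(z-ν).
The matrices A span an (mθ)-dimensional real vector space, V_{mθ},
where the norm square of each vector A is given by


        |A|^{2} = \operatorname{tr} A^{T} A = \sum_{i,\nu} A_{i\nu}^{2}.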


We call |A| the norm of the disturbance A. Each domain D_{x,S}(z) of
ergodicity of a digital system S is a finite domain in the vector space
V_{mθ}, containing the origin (see Fig. 26). A measure of the
‘largeness’ of D_{x,S}(z) would literally be the volume V_D of its domain
in V_{mθ}. However, we are more interested in the average magnitude
of the disturbances belonging to D_{x,S}(z). Accordingly, we define the
largeness, or the width, of the domain of ergodicity as the average value
of the norm square |A|^2:


(35)    w(D_{x,S}(z)) = \frac{\int_{V_D} |A|^{2}\, dA_{11} \cdots dA_{m\theta}}
                              {\int_{V_D} dA_{11} \cdots dA_{m\theta}}.


Fig. 26. The domain D of ergodicity for a digital system.
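
The definition of the width as an average of the norm square can be tried
out numerically. The following is only a sketch under stated assumptions:
the domain of ergodicity is supplied as a membership test, the averages
are estimated by uniform sampling from a surrounding cube, and the names
norm_square and width, as well as the parameter box, are hypothetical.

```python
import random


def norm_square(A):
    """|A|^2 = tr(A^T A), i.e. the sum of the squared entries of the matrix A."""
    return sum(a * a for row in A for a in row)


def width(in_domain, m, theta, box, samples=50_000, seed=0):
    """Monte Carlo estimate of the average of |A|^2 over a domain.

    in_domain : function taking an m x theta matrix (list of rows) and
                returning True if the disturbance belongs to the domain
    box       : half side of the sampling cube [-box, box]^(m*theta),
                assumed to enclose the whole domain
    """
    rng = random.Random(seed)
    total, hits = 0.0, 0
    for _ in range(samples):
        A = [[rng.uniform(-box, box) for _ in range(theta)] for _ in range(m)]
        if in_domain(A):
            total += norm_square(A)
            hits += 1
    return total / hits if hits else float("nan")


if __name__ == "__main__":
    # Assumed example domain: all disturbances with |A| < 1, for m = 1, theta = 2.
    estimate = width(lambda A: norm_square(A) < 1.0, m=1, theta=2, box=1.0)
    print(estimate)
```

For the disc of radius 1 assumed in the usage example the exact value is
1/2, since the integral of r^2 · r dr from 0 to 1 divided by the integral
of r dr from 0 to 1 equals (1/4)/(1/2); the estimate should lie close to
this value.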




For a general system S each component Δx_i of each disturbance
A = Δx|K^z_{z-θ} is a function of time defined in the interval (z-θ, z)
of the real axis representing time. The norm square of a disturbance
then becomes


        |A|^{2} = \sum_{i=1}^{m} \int_{0}^{\theta} [\Delta x_i(z-\nu)]^{2}\, d\nu.


If Σ = {σ} is the index set of the set {Δx|K^z_{z-θ}} of all the
m-component vector functions from K^z_{z-θ} to R^m, we have to define
a measure μ in Σ, that is, a function μ from F(Σ) to the set of
non-negative real numbers, such that


        \mu\Big( \bigcup_{k=1}^{\infty} \Sigma_k \Big) = \sum_{k=1}^{\infty} \mu(\Sigma_k)


for any sequence of disjoint sets Σ_1, Σ_2, ... ∈ F(Σ), and μ(∅) = 0 for
the empty set ∅. Then we can define the width of the domain of
ergodicity by


(36)    w(D_{x,S}(z)) = \frac{\int_{D_\Sigma} |A_\sigma|^{2}\, d\mu(\sigma)}
                              {\int_{D_\Sigma} d\mu(\sigma)}.


Here D_Σ is the subset of Σ composed of the indices of the disturbances
belonging to D_{x,S}(z):


        D_\Sigma = \{ \sigma ;\ A_\sigma \in D_{x,S}(z) \} \subset \Sigma.


Another characteristic which is interesting in connection with the 
ergodicity of a self-steering process is the strength of ergodicity. We 
define the strength of ergodicity of a self-steering process x(t) in a system 
S at a moment z by 


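(37)    E_{x,S}(z) = \left[ \frac{d(\ldots)}{dt} \right]_{t=z}
        [the differentiated function is illegible in the source]


Thus the strength of ergodicity indicates the speed at which the whole
S begins the elimination of the effects of a small disturbance
Δx|K^z_{z-θ} at the moment t = z.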

Substituting into the formula of E_{x,S}(z) the general solution (24)
for the function f,(t) in the case of sudden effect, we get: 


[The explicit expression for E_{x,S}(z) is illegible in the source. What
can be recovered is its form: a sum over the roots λ_j, with modulus r_j
and argument ω_j, in which each summand contains in parentheses three
terms, the first carrying the factor (1 - δ_{0k_j}), the second the
factor log r_j, and the third the factor ω_j.]


        E_{x,S}(z) → 0 when all r_j → 0.


Here the Kronecker delta symbol δ_{0k_j} is defined as having the
value 1 if k_j = 0, and as being zero otherwise. Thus the first term in the
parentheses vanishes if all the roots λ_j are single roots. In such a case
we get the further result that


        E_{x,S}(z) → 0 when all λ_j → +1 (for non-multiple roots).


Indeed, if all λ_j → +1, then all log r_j and all ω_j approach zero, and so
do the second and the third terms in the parentheses. (The limiting
case in which all λ_j are +1, however, presents itself as an m-fold
multiple root, and must thus be excluded, since the first term does not
then vanish.)

The results related to ergodicity, similar to those above, were
given the following interpretation by Oskar Lange²⁰. The cybernetic
notion of ergodicity gives a possible causal explanation for purposive
behaviour. Indeed, it is an essential characteristic of purposive behaviour
that the purposive being, when disturbed, returns to follow his
goal, as if directed by an inner force called his ‘own will’. In reality
this direction is determined by the internal structure of the system in
question, i.e. by the coupling matrix C and by the matrix ∂T/∂x. When
the eigenvalues of (∂R/∂x)_z = C(∂T/∂x)_z all approach zero, the
elimination of disturbance is slow, as is shown by the above formula for
E_{x,S}(z). It might happen that even the width of the domain of ergodicity
is small then, so that small disturbances would already suffice to divert
the system’s behaviour from its earlier goal x. Both of these situations
are characteristic of a young living being, whose purposive behaviour
is still uncertain. On the other hand, the ‘death’ of the system means
its permanent transformation to a decaying, anti-ergodic state for
which |λ| > 1. Obviously, there is a transitory period characterized by
|λ| = 1. For an ‘ageing’ system we could expect the domain of ergodicity
to be small again: the strength of the purposive behaviour of an ‘ageing’
system is decreased. On natural selection in phylogenetic development
Lange says: “only those ergodic processes of development (and
the corresponding systems) remain which are resistant to a high degree
against disturbances, i.e. processes with a large domain of ergodicity
and speedy disappearance of disturbances.” (Lange, ibid., p. 68).


20. Oskar Lange, Wholes and Parts, London 1965. Lange gave no definition
for the width or the largeness of the domain of ergodicity, and he did not calculate
a correct value for the strength or speed E_{x,S}(z) (cf. ibid., p. 66), thus concluding
erroneously that E → 0 when all |λ_j| → 1. However, these minor inexactitudes do
not affect Lange’s main results on ergodicity. For other differences
between Lange’s work and the present theory see the footnote on p. 159.


3 § On Future Development and Open 
Problems of Cybernetic Theory 


1 / The Need for a Theory Extending Over the Successive Phases 
of Self-Generating Dialectical Process: The Problem of 
Complication 


A limitation of the present cybernetic theory of self-generation is striking: 
this theory, presented above in §1 and §2, is restricted to a momentary 
analysis. The present theory tells only in which phase the self-generating 
process is at a certain moment of time, and within a certain interval 
of time around a given point of time. 

Accordingly, the existing theory tells whether the internal process of 
a given system is, at a certain moment, ergodic, or anti-ergodic, or 
stationary, or mechanical. If the internal process is in an ergodic phase 
this means that the system in question is in a process of purposive, 
gradual development toward a certain goal. If the process is in an 
anti-ergodic phase this means that the system is either changing its 
goal or, if no transition to an ergodic process is possible, developing 
toward an open antagonistic contradiction as a consequence of which 
the structure (and the process) of the system will undergo a qualitative 
change in the near future. 
Thus we can only say, on the basis of the present theory and if complete
information is available, whether a cybernetic whole whose structure
is known is undergoing an evolutionary or a revolutionary phase of
its internal, dialectical process at a given moment. The existing theory
does not tell how the evolutionary and revolutionary, gradual and
qualitative changes will follow each other in the course of the dialectical
process. We lack a cybernetic theory extending over the successive
phases of the self-generating dialectical process. Or, to use the classical
terminology of dialectical materialism, we still lack a mathematical
theory which could tell how the accumulation of certain ‘quantities’
will produce new qualities (‘transformation of quantities into qualities’).

The problem mentioned seems closely connected with the problem 
of complication. With increasing structural and functional complexity 
cybernetic systems seem to acquire qualitatively new properties, which 
the simpler systems do not possess. For instance, below a certain level 
of complication, as is shown in the theory of self-reproduction (see the
Appendix of Volume II), material systems are able to produce only
such other systems as are less complicated than they themselves
are (degenerative evolution). However, beginning with a certain level
of complication, self-generation, and even an evolution creating com- 
pletely new qualities, becomes possible. With such phenomena in mind 
John von Neumann was inclined to connect the problem of complica- 
tion with the problem of the creation of new qualities: 

“The discussions so far have shown that high complexity plays an 
important role in any theoretical effort relating to automata, and that 
this concept, in spite of its prima facie quantitative character, may 
in fact stand for something qualitative — for a matter of principle.” 
(J.v. Neumann, The general and logical theory of automata, p. 25, 
in L.A. Jeffress (Ed.), Cerebral Mechanisms in Behaviour, Hafner Co 
1967). 

It is obvious that the theory of self-generation must proceed to a
theory in which the self-generated increased complexity of the developing
systems produces new qualities in the course of increasing complication.
Only then can we consider the succession of gradual and qualitative
changes of the dialectical process by means of cybernetic theory.


2 / The Need For Spatial Localization of Cybernetic Systems: 
Cellular or Tessellation Models 


In the above, in §2.5, we have explicitly introduced time into the
description of cybernetic systems. Of course not only the time coordinate
but also the spatial coordinates, indicating the position of the system
and of its parts in space, should be explicitly introduced in order to
describe completely the characteristics of cybernetic systems as material
beings existing in space and time. This will certainly be a line of
development in future cybernetic theory.

The necessity of an explicit spatial description of cybernetic systems
has been indicated by some erroneous conclusions based on system
descriptions in which the explicit spatial description is missing. As
emphasized by the criticism of the so-called Rosen paradox, for instance, the
neglect of the spatial properties of cybernetic systems may lead to an
apparent paradox in the notion of self-reproducing systems²¹.

The examples now existing of the explicit introduction of space
coordinates in cybernetic theory are called cellular or tessellation models.
We shall meet an example of these models in the Appendix of Volume II.
In these models physical space is divided into cells, each of them having
a certain number of possible states (in a model of homogeneous space
each cell has, of course, the same set of possible states). A cybernetic
system, for instance an automaton, can now be described as a
spatial distribution of certain kinds of states over the cells.
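
The data structure implied by this description can be sketched in a few
lines. The sketch below is only an illustration under assumptions of its
own: a finite rectangular two-dimensional space, a distinguished
‘quiescent’ state for empty cells, and the hypothetical names
CellularSpace, set_state, state and support. None of these is taken from
the tessellation models cited in the text.

```python
from dataclasses import dataclass, field

QUIESCENT = 0  # assumed name for the state of an "empty" cell


@dataclass
class CellularSpace:
    """A homogeneous cellular space: every cell has the same set of states."""
    width: int
    height: int
    states: frozenset
    # Only non-quiescent cells are stored: (x, y) -> state.
    cells: dict = field(default_factory=dict)

    def set_state(self, x, y, state):
        if state not in self.states:
            raise ValueError(f"unknown state {state!r}")
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError("cell outside the space")
        if state == QUIESCENT:
            self.cells.pop((x, y), None)
        else:
            self.cells[(x, y)] = state

    def state(self, x, y):
        return self.cells.get((x, y), QUIESCENT)

    def support(self):
        """The cells occupied by the system, i.e. its spatial localization."""
        return sorted(self.cells)


if __name__ == "__main__":
    space = CellularSpace(width=4, height=3, states=frozenset({0, 1, 2}))
    # An "automaton" is a distribution of non-quiescent states over cells.
    space.set_state(1, 1, 2)
    space.set_state(2, 1, 1)
    print(space.support())
```

Storing only the non-quiescent cells makes the spatial extent of the
described system explicit, which is the point of the cellular
description.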

It is possible that space coordinates must be introduced in future
cybernetic theory not only in the context of certain particular types
of problems, like self-reproduction, but in general. This would mean
a mathematical metamorphosis of cybernetic theory, starting, for
instance, with the introduction of a cellular space as a first
approximation to a spatial localization of cybernetic systems. The von Neumann
construction of self-reproducing automata in fact suggests that the
problem of spatial localization and the problem of complexity are two
aspects of one and the same problem, to be solved in future topological
cybernetics.


3 / The Need for Realistic Probabilism: Thermodynamic models 
and Error Theory 


The views of the philosophical role of probability in science differ
widely depending on whether one considers it from the point of view
of idealism or from the point of view of materialism. Therefore, we
must carefully distinguish between two philosophies of probability in
science, the first of which is incompatible with dialectical materialism,
and with any sound scientific theory, while the second is what is actually
likely to be realized in cybernetics as well as in many other branches
of science. These two philosophies of probabilism in science could be
called idealistic probabilism and realistic probabilism, respectively.


21. See, e.g., L. Löfgren, Kinematic and tessellation models of self-repair, Biological
Prototypes and Synthetic Systems, Vol. 1, New York 1962, p. 362.

Idealistic probabilism suggests the replacement of the deterministic 
chains of cause and effect by probabilistic notions everywhere in scientific 
theory, as a matter of principle. Idealistic probabilism thus denies the 
existence of causal determinism. 

Realistic probabilism holds to causal determinism as a fundamental 
principle of science. It considers the introduction of probabilities just 
as a realistic device, to take into account our actual ignorance, our 
inadequate knowledge of some factors influencing the phenomena 
we are studying. 

Idealistic probabilism is one of the many forms that the positivistic
pursuit has taken in the bourgeois philosophy of science in our day.
Encouraged by the introduction of probabilities into quantum theory,
it is often stated that causal determinism, such as is involved in scientific
materialism, is ‘outdated’. Quantum theory has proved, one says, that
science cannot tell the ultimate truth about the existing world, that the
chains of cause and effect cannot explain the phenomena of our world.
So religion is necessary to fill the gap, after all.

This is a wishful interpretation of quantum theory. In reality there is
nothing in quantum theory which would deny causal determinism as
a fundamental principle of science. Probabilism in quantum theory is 
due to certain restrictions we meet when performing measurements in 
the very small dimensions of the atomic world (expressed in the so-called 
relation of uncertainty). These restrictions are taken into account in 
a realistic physical theory by introducing probabilities. Thus it is realistic 
probabilism, and not idealistic probabilism, which is inherent in modern 
quantum theory. 

The situation is the same in cybernetics. We can never accept idealistic 
probabilism in the context of cybernetic theory. Thus the basic chains 
of cause and effect, which form the foundations of cybernetic theory 
in the form of input-output relations, will never be ’outdated’ as the 
essential fundament of cybernetic theory. On the other hand, realistic 
probabilism is very likely to appear in future cybernetic theory too. 

Von Neumann’s ideas on future cybernetic theory here deserve
particular attention. Von Neumann remarked that in many realistic
situations where we have to apply cybernetic theory we do not know
exactly which input comes to the system at which time. Our ignorance
is best expressed by saying that we can only tell something of the
probability p(x) of a certain input x to appear (at a certain time t, or on
average over an interval of time, for instance). In fact we have followed
such a theoretical scheme all the time when discussing the input
organization (and the output organization, for that matter) on pp. 125–132
above. This approach can be developed by developing a systematic
probabilistic description of the environment of the system in the form
of a statistical theory. So we come to a statistical theory of information
which, due to its obvious connections with entropy and other
thermodynamic concepts, is likely to be a theory of the thermodynamical type
(cf. von Neumann, Theory of Self-Reproducing Automata, Illinois and
London 1966, pp. 62–63). Von Neumann says in his prophecy: “The
statistical variables of the automaton’s milieu will, of course, be somewhat
more involved than the standard thermodynamical variable of
temperature, but they will probably be similar in character. [...] I will not
go into the details of this, but I would like to emphasize that this
thermodynamical link is probably quite a close one.”

The description of the environment of the system being ’probabilized’ 
by means of a systematical, thermodynamical type of theory, the probab- 
ilities p(x) of the inputs x so defined will of course creep into the descrip- 
tion of the system too. If the inputs are determined up to a certain 
probability, so are the outputs and the inner states of the system. 
However, all the other probabilities in such a theory are strict consequ- 
ences of the input probabilities p(x), obtained when the latter are intro- 
duced to the strictly deterministic input-output and state transition 
formulae. Thus the ’thermodynamization’ of the description of the 
environment does not influence the strictly deterministic discussion 
of the functioning of the system in any way. We only have a strictly 
deterministic system located in a probabilistic environment, as a conse- 
quence of which the output of the system also appears to be statistically 
distributed. 
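
The argument of this paragraph, that the output probabilities are strict
consequences of the input probabilities p(x), can be sketched as follows.
The sketch assumes a finite set of inputs and a fixed inner state; the
function name output_distribution and the example output function are
hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from fractions import Fraction


def output_distribution(p_input, f, state):
    """Push the input probabilities p(x) through a deterministic output function.

    p_input : dict mapping each input x to its probability p(x)
    f       : deterministic output function, y = f(state, x)
    state   : the (fixed) inner state of the system
    """
    p_output = defaultdict(Fraction)  # missing outputs start at probability 0
    for x, p in p_input.items():
        p_output[f(state, x)] += p
    return dict(p_output)


if __name__ == "__main__":
    # Assumed example: a system whose output is the parity of state + input.
    def parity(state, x):
        return (state + x) % 2

    p_x = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
    print(output_distribution(p_x, parity, state=0))
```

Nothing random happens inside output_distribution: the system remains
strictly deterministic, and the statistical distribution of its output
is obtained solely by carrying the probabilities of the environment
through the deterministic formula.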

One further step toward a realistic probabilism in cybernetic theory
would be the introduction of a cybernetic error theory into the description
of the actual functioning of the components. Such a step could be
motivated as follows. Usually the components we consider in a cybernetic
system are fairly large material objects. The correct functioning of
such a component, described by the action operator T of this
component, is a consequence of certain cause-and-effect links in the inner
structure of the component which we do not explicitly consider in the
theory. Thus many factors, like common wear and tear, which are not
explicitly taken into account in the theory, can still affect the operator T.
In other words, there is the possibility that the correct functioning of
the component ceases, and the operator T suddenly is no longer what it
used to be. Such a possibility can be indicated by introducing a certain
lifetime for each component or, what amounts to the same, by
introducing certain probabilities with which the action operator T, when applied
to a fixed input x, will give different values of y. So, for instance, if
T(x) = y_0 indicates the correct functioning of the component in question,
we could postulate that T(x) has the value y_0 with a certain
probability, say 0.95. The remaining part 0.05 of the probability mass could
be distributed over the possible values y ≠ y_0, each of which indicates
a certain type of malfunction of the component.
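
The postulate can be made concrete in a short sketch. It assumes, beyond
what is said above, that the remaining probability mass is shared equally
among the malfunction values; the names faulty_operator_distribution and
sample_output are hypothetical.

```python
import random
from fractions import Fraction


def faulty_operator_distribution(y_correct, malfunctions,
                                 p_correct=Fraction(95, 100)):
    """Distribution of T(x) for a fixed input x.

    The correct value y_correct receives probability p_correct; the rest
    is shared equally (an assumption of this sketch) among the distinct
    malfunction values.
    """
    others = [y for y in dict.fromkeys(malfunctions) if y != y_correct]
    if not others:
        raise ValueError("at least one malfunction value is needed")
    share = (1 - p_correct) / len(others)
    dist = {y_correct: p_correct}
    dist.update({y: share for y in others})
    return dist


def sample_output(dist, rng):
    """Draw one value of T(x) according to the distribution."""
    values = list(dist)
    weights = [float(dist[v]) for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


dist = faulty_operator_distribution("y0", ["y1", "y2"])
rng = random.Random(1)
observed = [sample_output(dist, rng) for _ in range(1000)]
```

The randomness enters only through the sampling step, which stands for
our ignorance of the unmodelled causes inside the component; the
distribution itself is fixed once the probability 0.95 has been
postulated.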

One must again emphasize that neither the suggested first, thermo- 
dynamic step nor the second, error-theoretical step of probabilism 
will in any way change the foundation of cybernetic theory on strictly 
causal determinism. There will for ever be the strictly causal recursion
of the output of the system to the inner structure and the input in
cybernetic theory. The probabilities we introduce into the theory will
have no significance in principle: they only express our actual ignorance 
of this or that part of the causal process we are studying. Thus both 
suggested steps belong to the realm of realistic, not idealistic, probabilism. 


4 / One Further Need in Future Cybernetic Theory: A Theory
of Sensitive Systems 


One important further need in future cybernetic theory has been so
little appreciated in the existing theory that we must restrict ourselves
to merely mentioning it here. It is the need for a theory of sensitive systems.
By a sensitive system we mean a cybernetic system (or a whole) in which
the coupling parameters c as well as the action operators T may change
in the course of the internal process of the system. The cortex of the human
brain obviously is an example of such a sensitive system, and a cybernetic
theory of such systems would have much application in social science
too. But the theory of sensitive systems may be closely connected with
the theory of qualitative change, and thus with the cybernetic theory
of complication (see above, p. 169).


5 / What Will Be Preserved of Present Cybernetic Theory? 


Whatever innovations may appear in future cybernetic theory,
the solid materialistic foundation of cybernetics is surely there to
stay. What does this imply?




It implies 


(1) preservation of the nature of cybernetic systems as material 
beings localized in space and time: this is expressed by the term M, in 
the definition of cybernetic system, 


(2) preservation of the fundamental causal recursion in cybernetic
systems: this is expressed by the causal relation R, expressing the
recursion of the output of the system to its input and inner structure.


Thus the materialistic foundation of cybernetic theory can be expressed 
by the representation of cybernetic system as a combination S = (M,,R,) 
where R, = Ch M,. The essentials of this foundation, which were 
studied in more detail in Chapter II above, will hardly be subjected
to any alterations of principle in any future theory, though an
axiomatization may be developed.

If the positivistic philosophy of science is still able to hold its position
in the Western world for some time, there will of course be idealistic
interpretations of cybernetics too, just as there are at present. The
idealistic interpretations of cybernetics will emphasize, just as they
emphasize at present, the purely formal nature of the objects of
cybernetics. Idealistic interpretations will also seek possibilities, as
they are seeking at present, to deny the fundamental causal recursion
dominating cybernetic theory. However, these linguistic-idealistic
interpretations of cybernetic theory will hardly have any influence on the
actual progress of cybernetics.

The present cybernetic theory is not just any materialistic theory: it is
opposed to mechanistic materialism. This fundamental characteristic
of cybernetic theory can be expected to be preserved in future
theory as well. This means


(3) preservation and further development of the nature of fundamental 
cybernetic theory as a theory of self-generation of material beings actively 
changing themselves and their environment in a self-generating dialectical 
process, and 


(4) preservation of the typical methodological approach of cybernetics, 
viz. proceeding from wholes to parts. 


One could say that both of these characteristics will not only be
preserved but become more pronounced in future cybernetic theory.
In fact we have seen that there are considerable defects in the present
theory of self-generation: at present, we cannot yet follow the
development of self-generating dialectical processes over successive phases in
mathematical theory, but are so far confined to a momentary phase
of development. In future theory this restriction should be overcome.
One should eventually develop a cybernetic theory where the very
fundamental laws of the self-generating systems are themselves self-generating:
such a progress from self-generating systems to self-generating
fundamental laws is possibly connected with the problem of complication in
cybernetic theory.


CHAPTER IV: 


The Cybernetic Model of 
Rational Actor 


Georg Klaus, in his book on modern logic²², gives the following
table to indicate the systematics of general dialectics:


                           General Dialectics

  Objective Dialectics      Theory of Knowledge      Subjective Dialectics

  Dialectics   Dialectics                            Formal    Dialectical
  of Nature    of Society                            Logic     Logic


The subject of the present book has so far been objective
dialectics. However, we shall now make an excursion which extends
also to the department of subjective dialectics entitled ‘Formal Logic’.

Turing constructed a model of a ’rational actor’ capable of per- 
forming the logical steps required in correct computation. As ’computa- 
tion’, as characterized by Turing, is the prototype of all kinds of logical 
deduction, his model in fact is a model of beings capable of logical 
action. 


22. Georg Klaus, Moderne Logik, Berlin 1972. 




1 § The Turing Machine as
a Cybernetic System 


1 / The Human Cognitive System 


We can approach the idea of the Turing machine best if we start 
with the feedback system of the human cognitive system. By ‘cognitive
system’ we mean the total system composed of the central nervous
system, the receptors, and the effectors. The main channels of coupling 
in this system are indicated in Fig. 27. 

The main channels of coupling are indicated by heavy lines. These 
lines show the main course of the input-output processes in the system. 
The receptors first receive inputs from the environment of the system, 
then send outputs to the central nervous system, which sends outputs 
to the effectors. The effectors finally influence the environment by 
sending outputs. This line of channels forms the simple stimulus-response 
channel of impulses. Let us remark that the inputs come to this channel 
both from the outer environment of the organism and from the environ- 
ment of the cognitive system within the organism: the receptors receive 
inputs from both environments, the external and the internal. Likewise 
the effectors influence by their outputs both the external and the internal
environment of the body.

Fig. 27. [Diagram; surviving labels: Central Nervous System, Effectors, Environment.]

In addition to the simple stimulus-response channel the cognitive
system contains important feedbacks. One of the feedback channels,
marked by Roman numeral I in Fig. 27, may transmit to the receptors
messages on all the actions of the effectors. The receptors obtain
these messages by inspecting the influences of the effectors on the
environment of the cognitive system (both external and internal). Therefore
feedback channel I is drawn to begin from the output channel of the
effectors and to enter the environmental input channel of the receptors.
The other feedback channel (II) comes to the receptors directly from
the effectors. Through it comes the direct mechanical or chemical
influence of the muscles and the glands on the position and on the
physico-chemical state of the receptors (for instance, moving of the
eyes by eye muscles or by turning the head).


2 / Description of the Turing Machine 


Since the Turing machine²³ can be understood as a simple, idealized
model of the human cognitive system, it involves parts which correspond
to the three main parts of the cognitive system. In the Turing machine
a finite automaton A corresponds to the central nervous system. A device
D, which acts both as a receptor and as an effector of the automaton
A, corresponds to the receptors and the effectors. An infinite tape,
which is divided into successive squares, each of which contains a sign,
corresponds to the environment of the cognitive system (see Fig. 28).


23. A.M. Turing, On computable numbers, with an application to the
Entscheidungsproblem, Proc. London Math. Soc., Ser. 2, 42, pp. 230–265.


The Turing machine is also able to inspect its own output and choose
its inputs, just like the cognitive system. These abilities are realized in
the Turing machine in the following way: at each moment of the calendar
of the automaton A, the device D inspects one of the squares of the
tape. It first

— reads the sign which is in the square, then 

— writes a new sign (which may be the same as the old one) in the 
same square, and then 

— either moves to the square which is next on the right side or next 
on the left side, or remains unmoved. 

If D moves, the automaton “moves its receptor”. If D remains unmoved
in the same square, it will at the next moment of calendar time
read the sign which it wrote down one moment earlier, thus “inspecting
its own output”. These two functions correspond to feedbacks I and
II in the cognitive system.

To make things clear, let us represent the calendar of the automaton
A by the set of non-negative integers:


(1)    K = {0, 1, 2, ...}.


The machine T, by which we mean the automaton-device system, inspects
at each moment t one of the squares on the tape. The external input
of the machine at the time t thus is the sign a which is in the square
it is inspecting at the time t:


(2)    x(t) = a.


The external output of the machine at the time t is either the order
‘stop’ or a combination (b, m):


(3)    y(t) = ‘stop’ or (b, m).


The component b of the output represents the sign which is written
down by the device D after it has read the sign a. The component m
has three possible values m_+, m_-, and m_0. The value m_+ means
that the device D moves to the square next on the right side, after it
has written down the sign b, so that D will inspect the square next on
the right side at the time t+1. The value m_- means that D moves to
the square next on the left side, while the value m_0 indicates that D
will stay in the same square, and will inspect the same square still at
the time t+1. The output ‘stop’ means that the functioning of the
machine stops: the machine will no longer read or write signs, or move
along the tape.

Since A is a finite automaton, and the device D a static automaton
always in the same inner state, the functioning of the machine T is
completely determined by the output function f and the state-transition
function g (see p. 105). These functions determine the output and the
inner state of the machine at the time t+1 in terms of the inner state
and the input of the machine at the time t:


(4)    y(t+1) = f(s(t), x(t)),
       s(t+1) = g(s(t), x(t)).


If we denote the state s(t) by s, and the state s(t+1) by σ, we can rewrite
this as follows:


(5)    f(s, a) = (b, m) unless it is ‘stop’,
       g(s, a) = σ.


Thus the functioning of the Turing machine in the interval K^{t+1}_t is
completely determined by the sequence s a b m σ (unless the machine
stops).
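
How the two functions f and g drive the machine step by step can be
sketched in code. The sketch rests on assumptions of its own: the tables
f and g are given as dictionaries keyed by (state, sign), a missing entry
in f is treated as the order ‘stop’, the tape is initially filled with
one blank sign, and the names run, MOVES and the sample machine are
hypothetical.

```python
from collections import defaultdict

# The three possible values of the component m: m_+, m_-, m_0.
MOVES = {"+": 1, "-": -1, "0": 0}


def run(f, g, start_state, steps, blank="a0"):
    """Run a machine given by an output table f and a state-transition table g.

    f[(s, a)] = (b, m)  : sign to write and movement of the device D
    g[(s, a)] = sigma   : next inner state
    A pair (s, a) missing from f is treated as the order 'stop'
    (an assumption of this sketch).
    """
    tape = defaultdict(lambda: blank)   # unbounded tape, initially blank
    position, s = 0, start_state
    for _ in range(steps):
        a = tape[position]              # external input x(t) = a
        output = f.get((s, a), "stop")
        if output == "stop":
            break
        b, m = output
        tape[position] = b              # write the sign b
        position += MOVES[m]            # move the device D
        s = g[(s, a)]                   # new inner state sigma
    return tape, position, s


# A small four-state sample machine that always moves to the right.
f = {("s1", "a0"): ("a1", "+"), ("s2", "a0"): ("a0", "+"),
     ("s3", "a0"): ("a2", "+"), ("s4", "a0"): ("a0", "+")}
g = {("s1", "a0"): "s2", ("s2", "a0"): "s3",
     ("s3", "a0"): "s4", ("s4", "a0"): "s1"}

if __name__ == "__main__":
    tape, position, state = run(f, g, "s1", steps=8)
    print([tape[i] for i in range(8)], position, state)
```

Each pass through the loop realizes exactly one sequence s a b m σ: the
pair (s, a) is looked up, b is written, m is carried out, and σ becomes
the inner state for the next moment of the calendar.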

As a finite automaton, A has only a finite number, say n, of inner
states: s_1, s_2, ..., s_n. Introducing a finite ‘alphabet’ B = {a_0, a_1, a_2, ...}
of the possible signs, we have a Turing machine which has only a finite
number of possible sequences s a b m σ. Writing down all these sequences,
we get a structure description of the machine T:


(6)    s a b m σ;  s′ a′ b′ m′ σ′;  ...

Here each of the s, s′, ... is one of the inner states s_1, s_2, ..., s_n,
and each of the signs a, a′, ... or b, b′, ... is one of the signs
a_0, a_1, a_2, ... of the alphabet B. Each of the m, m′, ... is one of the
three possibilities m_+, m_-, and m_0, and each of the symbols σ, σ′, ...
is again one of the inner states s_1, s_2, ..., s_n.

Following Turing’s original paper, we can introduce a standard
description of the Turing machine in the following way: for each s_j
appearing in the above sequence as one of the symbols s, s′, ... or
as one of the symbols σ, σ′, ... we write the letter D followed by
the letter A, A being repeated j times. For instance, s_3 would be written
DAAA. For each a_l appearing in the structure description of the machine
as one of the symbols a, a′, ... or as one of the symbols b, b′, ...
we write the letter D followed by the letter C, C being repeated l times.
Thus a_2, for instance, would be written DCC. For m_+ we write R,
for m_- we write L, and for m_0 we write N. Then we get the standard
description of the Turing machine T as a sequence composed of the
symbols A, C, D, L, R, N, and the symbol “;”.

We can, if we like, replace the standard description by a number. 
Following Turing’s paper we can replace A by 1, C by 2, D by 3, L by 
4, R by 5, N by 6, and ”;” by 7. Then we get the complete description 
of the Turing machine in terms of its description number. This number 
is, of course, finite. Thus we have shown that each Turing machine 
can be completely characterized by a single finite number (the Turing 
machines can thus be effectively enumerated). 


Example. Let the tape have in each of its squares the same sign a_0 before the 
machine starts working. And let the machine be such that it moves to the right all 
the time. Then the machine always reads a_0, writes a symbol a_i, and moves to the 
square next on the right: the machine never comes back to a square where it has 
already been. By such a simple Turing machine we can write down on the tape 
any infinite but cyclic sequence of signs we like. For instance, let us consider the 
machine of this type which writes down the sequence a_1 a_0 a_2 a_0 a_1 a_0 a_2 a_0 ... 
This machine thus has an alphabet B = {a_0, a_1, a_2}. The machine needs only four 
inner states s_1, s_2, s_3 and s_4, and a complete description of the machine is given by 


(7) s_1 a_0 a_1 m_+ s_2; s_2 a_0 a_0 m_+ s_3; s_3 a_0 a_2 m_+ s_4; s_4 a_0 a_0 m_+ s_1. 


The standard description is thus given by DADDCRDAA; DAADDRDAAA; 
DAAADDCCRDAAAA; DAAAADDRDA;. The description number of this particular 
Turing machine is 


(8) 31332531173113353111731113322531111731111335317. 
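
The passage from the structure description (7) to the standard description 
and then to the description number (8) is purely mechanical. A minimal 
Python sketch follows; the tuple encoding of the quintuples and the 
function names are assumptions chosen for illustration, while the letter 
and digit assignments are those given in the text. 

```python
# Structure description (7) as index tuples (j, i, i2, move, j2):
# state s_j, sign read a_i, sign written a_i2, move letter, next state s_j2.
QUINTUPLES = [
    (1, 0, 1, "R", 2),
    (2, 0, 0, "R", 3),
    (3, 0, 2, "R", 4),
    (4, 0, 0, "R", 1),
]

# Digit assigned to each symbol of the standard description.
DIGITS = {"A": "1", "C": "2", "D": "3", "L": "4", "R": "5", "N": "6", ";": "7"}

def standard_description(quintuples):
    """s_j -> D followed by j letters A; a_i -> D followed by i letters C."""
    parts = []
    for j, i_read, i_written, move, j_next in quintuples:
        parts.append(
            "D" + "A" * j
            + "D" + "C" * i_read
            + "D" + "C" * i_written
            + move
            + "D" + "A" * j_next
            + ";"
        )
    return "".join(parts)

def description_number(sd):
    """Replace every symbol of the standard description by its digit."""
    return int("".join(DIGITS[symbol] for symbol in sd))

sd = standard_description(QUINTUPLES)
dn = description_number(sd)
print(sd)
print(dn)
```

Run on the four quintuples of the example, the sketch reproduces the 
standard description and the number (8) given above. 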


3 / Computation in the Turing Machine 


The functioning of the Turing machine is called ’computation’. This 
is because of the original intention of Turing to represent the machine 
as an idealization of the process of computing as it occurs in the human 
mind. For instance, the inner states of the machine correspond to the 
mental states a human being assumes when performing computation 
(Turing, ibid.). 

The Turing machine can be programmed to compute in many different 
ways. The way in which we make it compute also depends on the thing 
we want to compute. We shall consider in more detail two kinds of 
computation, viz. the computation of a number and the computation 
of a function (the values of a function) in the Turing machine. 

Turing in his original paper discussed the computation of numbers 
only. His design for the computation of a number in the Turing machine 
was the following: let the tape originally have the same sign a_0 in every 
square. This sign is called ’blank’. Let the other signs of the finite 
alphabet B of the machine be called either ’figures’ or ’non-figures’. 
We can always operate with only two figures, viz. the signs a_1 = 0 
and a_2 = 1. The non-figures a_3, a_4, ..., a_r may be different kinds of 
symbols which we need not specify in more detail here. The total alphabet 
of our machine will thus be 


(9) B = {a_0 = ’blank’, a_1 = 0, a_2 = 1, a_3, a_4, ..., a_r}. 


The machine begins working at the initial time t = 0. It then reads 
’blank’ from a square, and writes down a figure, either 0 or 1, in this 
square. Then we let the machine work in such a way that the tape will 
at any time t > 0 have, when reading the tape from the initial F_1-square 
to the right, a sequence of signs of the general form 


(10) F_1 E_1 F_2 E_2 F_3 E_3 F_4 E_4 F_5 E_5 ... F_s a_0 a_0 a_0 ... 


Here each F is a figure, either 0 or 1, and each E is either ’blank’ or 
a ’non-figure’. The number s of figures written down by the machine 
on the ”F-squares” is unlimited: we have Turing machines which will 
never stop. The squares having an E-sign are called ”E-squares”. 
The sequence of figures written down by the machine before it stops, 


(11) F_1 F_2 F_3 F_4 F_5 ... F_s 


is the binary representation of the real number ”computed” by the 
Turing machine. Of course, s is infinite for irrational but computable 
real numbers (and for rationals such as 1/3 whose binary expansion does 
not terminate). The signs on the E-squares represent signals 
needed for the arrangement of computation. They do not contribute 
anything to the final result, and they may be erased. This is Turing’s 
original design for the computation of numbers in the Turing machine. 
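
As an illustration of this design, the machine of the example (7) can be 
run for a finite number of moves and its F-squares read off. The sketch 
below is in Python; the table encoding and the function names are 
assumptions. Reading the figures as a binary fraction 0.F_1 F_2 F_3 ... 
follows the convention of Turing’s paper and is likewise an assumption 
here, since the text above speaks only of ’the binary representation’. 

```python
from fractions import Fraction

# Example machine (7): (state j, index of sign read) ->
# (index of sign written, move, next state). All moves are to the right.
TABLE = {
    (1, 0): (1, +1, 2),
    (2, 0): (0, +1, 3),
    (3, 0): (2, +1, 4),
    (4, 0): (0, +1, 1),
}
FIGURE = {1: "0", 2: "1"}   # alphabet (9): a_1 = 0, a_2 = 1; a_0 is 'blank'

def figures_after(moves):
    """Run the machine and return the figures found on the F-squares."""
    tape, pos, state = {}, 0, 1
    for _ in range(moves):
        written, move, state = TABLE[(state, tape.get(pos, 0))]
        tape[pos] = written
        pos += move
    # The F-squares are the alternate squares 0, 2, 4, ...
    return "".join(
        FIGURE[tape[p]] for p in sorted(tape) if p % 2 == 0 and tape[p] in FIGURE
    )

def as_fraction(figures):
    """Value of the binary fraction 0.F_1 F_2 ... F_s."""
    return sum(Fraction(int(f), 2 ** (k + 1)) for k, f in enumerate(figures))

figs = figures_after(16)
approx = as_fraction(figs)
print(figs, approx)
```

Under this reading the partial sums approach 1/3, a rational number whose 
binary expansion never terminates. 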

Computation of functions in the Turing machine can also be designed 
in many ways. One of these is the following: we want to compute the 
values of a function F(n_1, n_2, ..., n_q), where n_1, n_2, ..., n_q and the 
values of the function are non-negative integers. We can do this by using an 
alphabet containing only one figure, viz. the sign a_1 = 1. We arrange 
the original situation so that the tape has at the time t = 0 the following 
sequence of signs: 


(12) \underbrace{1\,1 \ldots 1}_{n_1+1}\; a_0\; \underbrace{1\,1 \ldots 1}_{n_2+1}\; a_0 \ldots a_0\; \underbrace{1\,1 \ldots 1}_{n_q+1}\; a_0\, a_0\, a_0 \ldots 


Beginning with the square which the machine is reading at the time 
t = 0 there are thus n_1+1 figures 1, then a ’blank’ square, then n_2+1 
squares with the figure 1, then again a ’blank’ square, etc., until the 
last n_q+1 figures 1 are followed by only ’blank’ squares. The original 
sequence indicated above may be written either on the successive squares, 
or on the alternate F-squares (between which there are the E-squares 
for operational signs as before). The construction of the machine naturally 
depends on the choice of writing the original sequence. 

One of the designs for the computation of a function in the Turing 
machine is such that the machine simply writes down on the tape, 
after the last n_q+1 figures 1, first the ’blank’ sign a_0, and then n+1 
figures 1, there being F(n_1, ..., n_q) = n. Another design for the 
computation of a function is such that one counts the number n of the 
figures 1 on the tape when the machine stops: this number gives the value of 
F(n_1, n_2, ..., n_q) in this design of computation. 
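
The second design can be tried out on a small case. The Python sketch 
below builds the initial tape (12) and runs a three-state machine for 
F(n_1, n_2) = n_1 + n_2 which erases the two surplus figures 1 and then 
stops. The machine `ADD`, the use of "_" for the blank a_0, and all 
function names are hypothetical; they are not taken from the text. 

```python
BLANK, ONE = "_", "1"

def initial_tape(*args):
    """Tape (12): a block of n_i + 1 figures 1 per argument, one blank between blocks."""
    return list(BLANK.join(ONE * (n + 1) for n in args))

# Hypothetical machine for F(n_1, n_2) = n_1 + n_2 in the counting design:
# (state, sign read) -> (sign written, move, next state).
ADD = {
    (1, ONE): (BLANK, +1, 2),     # erase the first surplus 1
    (2, BLANK): (BLANK, +1, 2),   # skip the separating blank if n_1 = 0
    (2, ONE): (BLANK, 0, 3),      # erase the second surplus 1
}                                 # state 3 has no entries: the machine stops

def run_until_stop(table, tape, state=1, pos=0, max_moves=10_000):
    """Run until no quintuple applies; return the final tape."""
    for _ in range(max_moves):
        sign = tape[pos] if pos < len(tape) else BLANK
        if (state, sign) not in table:
            return tape
        written, move, state = table[(state, sign)]
        if pos == len(tape):
            tape.append(BLANK)    # extend the finite list standing in for the tape
        tape[pos] = written
        pos += move
    raise RuntimeError("machine did not stop within max_moves")

def value(tape):
    """Counting design: the result is the number of figures 1 left on the tape."""
    return tape.count(ONE)

print(value(run_until_stop(ADD, initial_tape(2, 3))))
```

The initial tape holds n_1 + n_2 + 2 figures 1, so removing two of them 
leaves exactly n_1 + n_2 to be counted when the machine stops. 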

Obviously computation is just another name for logical deduction. 
Wherever we have a mathematical or logical calculus, where some 
axioms and some rules of inference are given, all deduction of theorems 
from the axioms by means of the given rules is a problem of computation. 
When a Turing machine is designed so that it computes an answer, 
”yes” or ”no”, to the question whether a given theorem can be deduced 
from the given axioms by the given rules of inference, the Turing machine 
is said to be used as a decision machine. A theory may or may not have 
a decision machine. 

Computation in the Turing machine may be an infinite or a finite 
process. It is infinite, if the machine never stops once it has started the 
computation. This is the case, for instance, in the computation of a 
computable but irrational number. A Turing machine, designed to perform a 
certain computation, is said to be circular if the computation is an 
infinite process. If the computation stops after a finite number of 
steps the machine is called circle-free. 


2 § The Turing Machine as an 
Idealized Model of Rational Actor 


1 / The Turing Machine as a Model of the Optimal Organization of a 
Rational Actor 


It is evident that the Turing machine can be designed to perform many 
kinds of computations. It can even be designed to do the same computation 
in many kinds of ways. In fact we have good reason to believe 
that with Turing machines one can compute everything which is 
computable. This is the content of Turing’s thesis which can be formulated 
as follows: 


(1) If there is an effective procedure for the performance of some 
computation, there is a Turing machine in which this computation 
can be realized (Turing’s thesis). 


The existence of an effective procedure or an algorithm means that 
some rules are known by means of which one can perform the calculation 
in successive steps in quite an unambiguous way. There are finite 
and infinite algorithms. Of course each particular effective procedure is 
different from every other, and we cannot give any general definition 
which could include all the possible effective procedures that will ever 
be discovered. If we should suggest an exact definition we could never 
be sure that all the possible future kinds of calculation were included. 

Accordingly, a statement like Turing’s thesis is not, and cannot be, 
a mathematical theorem which could be proved or disproved. It is 
rather a suggested ’natural law’ which expresses a hypothesis on the 
organization of all computing systems, viz. the statement that an optimal 
computing system has to have the organization of the Turing machine. 
As soon as a new kind of effective procedure for the performance of 
some computation is discovered we can verify or falsify the statement 
that this computation is realizable in a Turing machine. This has actually 
been done, and the verification of Turing’s thesis is already rather 
convincing. For its generality and importance as an empirically verifiable 
hypothesis Turing’s thesis has been compared with the two fundamental 
laws of thermodynamics, viz. the Conservation of Energy and the 
Entropy Law.²⁴ 

Why is Turing’s thesis so important? Because it describes the optimal 
organization for a computing system. On the other hand, we know 
that computation in the Turing machine is equal to logical deduction. 
Thus, if we define the rational actor as a being capable of logical 
deduction, we can put Turing’s thesis in the following form: 


(1') The optimal organization of a rational actor is that of the 
Turing machine, that is: the organization of a whole composed 
of a digital part, viz. a finite automaton, and some not necessarily 
digital receptor and effector organs so that it is able to choose 
its inputs and inspect its output like a Turing machine. Such a 
cybernetic whole is capable of (recursive, algorithmical) logical 
deduction provided that its environment is rich enough in logical 
possibilities so that it corresponds, as a potential input store, to 
the infinite tape of the Turing machine. 


24. See H. Hermes, Enumerability, Decidability, Computability (Springer 1965), 
p. 18. This book can be recommended to readers interested in mathematics who 
want more information about the Turing machines. 


This more expressive, though less precise form of Turing’s thesis 
tells roughly what a rational being should look like. Man, and especially 
the cognitive system of man, is an example of such a rational being. 
To make of man a rational being, in the sense of Turing’s thesis, the 
nervous system or some aspect of it must be a finite automaton. This 
is what we must take for granted. Furthermore, man can move his 
receptors, and even move himself, and thus is able to choose his input. 
This is indicated, in the rough scheme of human organization on p. 178, 
by feedback I. By feedback II man is able to inspect his output. Thus, 
obviously man is a rational being in the sense of Turing’s thesis. 

We can say more than this. The phylogenetic development has gone 
through phases which obviously reflect a tendency toward better realiza- 
tion of the organization of the Turing machine. So, for instance, animals 
in general realize this principle better than plants. Animals have developed 
specialized receptors and effectors, and also the specialized central 
automaton, to a much greater completeness than plants. Animals are 
also better able to move themselves, and to move their receptors and 
effectors than plants. In the human being the phylogenetic development 
of animals has produced a being which is capable of building a great 
variety of artificial receptors and effectors, viz. different working tools. 
In this way the ability to choose the input and to control the output 
has improved still further from animal to man. Thus man realizes 
the organization of the Turing machine better than any animal. Indeed, 
the well known definition of man as an animal who builds working 
tools (Benjamin Franklin), comes very close to the definition of man 
as a rational being in the sense of Turing’s thesis. 


2 / The Universal Turing Machine as a Model of an Optimal 
Rational Actor 


Another important statement on Turing machines is the following: 


(2) There are Turing machines which are capable of computing 
everything which can be computed by any Turing machine. 




Those machines are called universal Turing machines. This statement, 
unlike Turing’s thesis, is a mathematical theorem. It was proved by 
Turing himself. I shall sketch in the following the essential steps of 
Turing’s proof for the existence of universal machines. 

We have to construct a Turing machine U which, when given any 
Turing machine T_C designed to perform a certain process (C) of 
computation, is able to do the same computation (C). A necessary condition 
for this is, obviously, that one can give the machine U complete information 
on the machine T_C. This is possible since, as we know, the complete 
description of any Turing machine is a finite sequence of symbols which 
can be written on the tape of U before U begins working. The description 
of the machine T_C can thus be fed to the machine U. 

The trick used by Turing for the construction of U was based on the 
observation that not only the machine T_C itself but also the process 
of computation, (C), can be completely described by a sequence of 
symbols suitable for printing on the tape. This sequence is finite only 
for a circle-free machine T_C, and infinite for a circular T_C which, however, 
does not matter. The first task of constructing the universal machine 
U is to give a complete description of the computation (C) for which 
a given Turing machine T_C is designed. 

We shall restrict ourselves here, as Turing did in his original paper, to 
the discussion of only one kind of computational design, viz. to the 
computation of numbers as explained on pp. 182-183. As an example we 
shall give a simple complete description of the process of computation 
(C) of the Turing machine T_C which writes down on the F-squares the 
infinite sequence 01010101... This is the machine whose standard 
description and description number were constructed in the example 
on p. 182. 


Denoting the blank squares by x the complete sequence written by 
T_C is 0x1x0x1x0x1x... Its alphabet is B = {a_0 = x, a_1 = 0, a_2 = 1}, 
and it has four inner states s_1, s_2, s_3 and s_4. The structure description 
of the machine itself is (cf. p. 182) 


(1) s_1 x 0 m_+ s_2; s_2 x x m_+ s_3; s_3 x 1 m_+ s_4; s_4 x x m_+ s_1; 


It begins working in the inner state s_1, the tape being completely 
blank. In its first move it scans the symbol x, writes down the symbol 0, 
moves right, and goes over to the state s_2. In the second move it scans 
x, writes down x, moves right, and goes over to s_3. In the third move 
it scans x, writes down 1, moves right, and goes over to s_4. In the fourth 
move it scans x, writes down x, moves right, and goes back to s_1, 
so that the working continues repeating these four moves cyclically. 
This is the process (C) we ought to describe by a sequence of symbols. 
We can choose, for instance, the following complete description of (C): 


(2) b:0xc:0xxe:0x1xk:0x1xxb:0x1x0xc:0x1x0xxe:0x1x0x1xk:, 
etc. 


Here we have denoted s_1 = b, s_2 = c, s_3 = e, and s_4 = k, for 
convenience. Each move of the computational process is described by the 
sequence of symbols between two subsequent colons. After each colon 
we have first written the sequence of symbols which is on the tape 
after this move, beginning with the first 0, and ending with the symbol 
that the machine is scanning at the end of the move. After this last 
symbol there is the symbol of the state to which the machine is transferred 
in this move. Then comes the next colon. Each sequence of symbols 
between two subsequent colons is called a complete configuration of 
the process (C). Thus each move of the process is described by such 
a complete configuration. Since the process (C) is infinite, so is the total 
sequence of symbols describing it. 

For convenience we can introduce the standard description of the 
computation (C), just as we introduced the standard description of the 
machine T_C on p. 181. The latter was obtained by replacing, in the 
complete description of T_C, the state s_k by the letter D followed by k 
letters A, the symbol a_i by the letter D followed by i letters C, and the 
signs m_+, m_- and m_0 by the letters R, L and N, respectively. For 
the standard description of the machine T_C we thus obtained (cf. p. 182) 


(3) DADDCRDAA; DAADDRDAAA; DAAADDCCRDAAAA; 
DAAAADDRDA; (S.D. of T_C). 


For the standard description of the computation process, (C), we now 
obtain, by the same replacements for the s_k and a_i: 


(4) DA: DCDDAA: DCDDDAAA: DCDDCCDDAAAA: 
DCDDCCDDDA: DCDDCCDDCDDAA: etc. (S.D. of (C)). 


Now that we have the standard descriptions of both the machine 
T_C and the process of computation (C), we can state precisely what we 
mean by the problem of constructing a Turing machine, say U', capable 
of performing the same computation (C) as the machine T_C. First we 
can set up a machine which, when given the standard description of 
T_C on the tape, can write down on the tape the successive complete 
configurations of (C). We can agree that both the S.D. of T_C and the 
S.D. of (C) are to be located on the F-squares of the tape of U', so that 
the E-squares are left for computational signs. For the reading or printing 
of the standard descriptions the machine U' must have the letters A, 
C, D, L, R, and N, and the signs ”:” and ”;” in its alphabet. In order 
that the desired sequence 01010101... could be explicitly read from 
the tape of U' we can design the machine U' so that it writes down, 
between each successive pair of complete configurations of (C), the 
figures which appear in the new configuration but not in the old one. 
Thus U' would write down, instead of the S.D. of (C) given above, 
the following sequence: 


(5) DA: 0: DCDDAA: DCDDDAAA :1: DCDDCCDDAAAA: 
DCDDCCDDDA: 0: DCDDCCDDCDDAA: etc. 
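
The sequences (2), (4) and (5) can be generated mechanically, which gives 
a convenient check on them. The Python sketch below does so for the 
machine T_C; the function names and the simplification that T_C always 
moves right onto a blank square are assumptions of the sketch, and the 
spaces printed in (4) and (5) are omitted. 

```python
# Machine T_C with the abbreviations b, c, e, k for s_1, ..., s_4:
# state -> (symbol written, next state). T_C always scans a blank x and
# always moves right, so no more information is needed in this sketch.
T_C = {"b": ("0", "c"), "c": ("x", "e"), "e": ("1", "k"), "k": ("x", "b")}

# Replacements used for the standard description of (C).
SD = {"x": "D", "0": "DC", "1": "DCC",
      "b": "DA", "c": "DAA", "e": "DAAA", "k": "DAAAA"}

def configurations(count):
    """First `count` complete configurations of (C), as in sequence (2)."""
    tape, state, configs = "", "b", ["b"]
    for _ in range(count - 1):
        written, state = T_C[state]
        tape += written
        # tape so far, then the scanned (blank) symbol, then the new state
        configs.append(tape + "x" + state)
    return configs

def sd_of(config):
    """Standard description of one complete configuration, as in (4)."""
    return "".join(SD[symbol] for symbol in config)

def figures(config):
    return "".join(ch for ch in config if ch in "01")

def u_prime_output(count):
    """Sequence (5): S.D. of (C) with each newly appearing figure inserted."""
    configs = configurations(count)
    out = [sd_of(configs[0])]
    for old, new in zip(configs, configs[1:]):
        new_figures = figures(new)[len(figures(old)):]
        if new_figures:
            out.append(new_figures)
        out.append(sd_of(new))
    return ":".join(out) + ":"

print(":".join(configurations(6)) + ":")
print(u_prime_output(6))
```

Reading only the items that consist of figures in the second printed line 
recovers the beginning of the sequence 0101... written by T_C. 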


Accordingly, U' should have in its alphabet, in addition to the letters 
and signs mentioned above, and in addition to a suitable number of 
computational signs, the two figures 0 and 1. Then the task of constructing 
a Turing machine U', capable of performing the same computation 
(C) as the machine T_C, means the following: let initially the S.D. of 
T_C be given on some successive F-squares on the tape of the machine 
U', together with a number of computational signs (located either on 
E- or F-squares). The problem is to construct, using a freely chosen finite 
number of computational signs, such state-transition and output functions 
for U' that it will write down on the successive F-squares of the 
tape the S.D. of (C) completed by the figures 0 and 1 of the sequence 
01010101... as indicated above. This is a definite task to perform. 
After this has been done we have only to see to it that the S.D. of T_C 
can be exchanged for the S.D. of any other Turing machine, and we 
have constructed a universal machine U. 
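
The idea that one fixed procedure can imitate any machine once it is 
handed that machine’s description can be illustrated in Python. The 
sketch below is an interpreter written in an ordinary programming 
language, not Turing’s machine U. The function names, the regular 
expression, and the convention that every machine starts in s_1 on a 
blank tape are assumptions of the sketch. 

```python
import re

LETTERS = {"1": "A", "2": "C", "3": "D", "4": "L", "5": "R", "6": "N", "7": ";"}
MOVES = {"L": -1, "R": +1, "N": 0}
# One quintuple of a standard description: s_j, a_i read, a_i written, move, next s_j.
QUINTUPLE = re.compile(r"D(A+)D(C*)D(C*)([LRN])D(A+)")

def decode(description_number):
    """Turn a description number back into a table of quintuples."""
    sd = "".join(LETTERS[digit] for digit in str(description_number))
    table = {}
    for item in sd.split(";"):
        if not item:
            continue
        match = QUINTUPLE.fullmatch(item)
        if match is None:
            raise ValueError("not a standard description: " + item)
        j, i_read, i_written, move, j_next = match.groups()
        table[(len(j), len(i_read))] = (len(i_written), MOVES[move], len(j_next))
    return table

def imitate(description_number, moves):
    """Imitate the described machine for at most `moves` moves.

    Assumed convention: start in state s_1 on a tape of blanks a_0.
    Returns the indices i of the signs a_i on the squares written so far.
    """
    table = decode(description_number)
    tape, pos, state = {}, 0, 1
    for _ in range(moves):
        key = (state, tape.get(pos, 0))
        if key not in table:
            break                      # no quintuple applies: the machine stops
        written, move, state = table[key]
        tape[pos] = written
        pos += move
    return [tape[p] for p in sorted(tape)]

DN = 31332531173113353111731113322531111731111335317   # number (8), p. 182
print(imitate(DN, 8))
```

The function `imitate` contains nothing specific to T_C; exchanging the 
number passed to it for the description number of another machine changes 
which computation is imitated, while the interpreter stays the same. 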

Above we roughly described the outline of the proof given by Turing. 
If you want to read Turing’s original paper (mentioned in the footnote 
on p. 179) you should notice that he has two more technical tricks in 
his proof. First, when writing the complete configurations of (C), he 
combines several moves together by allowing the machine to perform 
several successive readings, printings, and moves within a single con- 
figuration (see his page 235). Secondly, he introduces some abbreviated 
notations for the writing of sequences of complete configurations. 
Turing’s final rules for the construction of a universal machine U, given 
on his pages 244—246, are written by using these abbreviated notations. 

The existence of universal Turing machines is essentially based on 
two facts. First, that every Turing machine can be completely described 
by a finite sequence of symbols. Second, that once the machine exceeds 
a necessary minimum of structural complexity, it can be made to imitate 
the computation of any more complicated machine by compensating 
for the lack of complexity by deliberate lengthening of the process 
of computation. 

The former fact we know from pp. 181—182 where we showed that 
every Turing machine can be completely described by a finite sequence 
of symbols, for instance, by the S.D. (standard description) or by the 
D.N. (description number). The latter fact needs some consideration. 
What, to begin with, is meant by structural complexity ? 

A fairly reasonable measure of the structural complexity of a Turing 
machine T would be the length l(T) of its structure description. By 
length we mean the number of symbols in the sequence in question. 
The primitive complexity²⁵ thus defined is greater the more there are 
combinations of (input, state)-pairs and outputs, and of (input, state)-pairs 
and next states in the output and the state-transition functions f and g. 

Now we can order the primitive complexities of Turing machines 
according to magnitude, beginning with the smallest: 


(6) l_1 < l_2 < l_3 < ... < l' < ... 


Among the simplest machines there are no universal machines. But 
beginning with a certain level, say l', of primitive complexity the first 
universal machine U_1 appears. It is able to imitate all the Turing machines 
whatever their level of primitive complexity, for instance, a machine 
whose primitive complexity is 10^10 times l'. At first glance this seems 
paradoxical. How is the machine U_1 able to perform the same functions 
as the bigger and more complex machine? Because we have no upper 
limit for the time, or for the number of successive computational steps, 
after which a universal machine must have completed a certain part 
of the computation. If the bigger machine needs m successive steps of 
computation (m calendar units of time), the machine U_1 may be allowed 
to use say 10° times m steps of computation. Thus the complexity of 
structure is, as it were, compensated by lengthening the process of 
computation. However, before this can happen we must have exceeded 
the threshold l' below which no universal machines appear. 


25. Less primitive notions of complexity can be introduced, if we represent the 
finite automaton of the Turing machine by a neural network (the possibility of 
which was proved by McCulloch and Pitts, cf. p. 192). We can, for instance, define 
the functional complexity of a network as the length of the longest period of 
reverberation possible in the network (’reverberation’ means the circulation of 
input/output around a circuit of feedback). This corresponds to the notion of 
’order’ of the network by McCulloch and Pitts. 

When we think of material systems having the general organization 
of a Turing machine, we again come at the level l' of primitive complexity 
to the point where the first universal machine should appear. A material 
system whose structure as a Turing machine is described by the D.N. 
of the machine U_1 obviously is an optimal rational being which, 
theoretically, is capable of any possible sequence of logical deduction. However, 
for a real material being we can never assume the length of a process 
of computation performed by it to be unlimited. No material object is 
capable of preserving its structure and its mode of action forever. For 
instance, every material object is subjected to common wear and tear. 
Accordingly, no real material system can act as a universal Turing 
machine even though its structure might be that of a universal machine: 
it may only have the logical possibilities of acting as a universal machine. 
This is what is meant when speaking, for instance, of the human being 
as a universal Turing machine. 

The structure of every universal Turing machine U is, of course, 
completely determined by the D.N. of this machine. There is an infinite 
sequence of universal Turing machines, each of them completely described 
by the respective description number in the infinite sequence 
N_1, N_2, N_3, ... of the description numbers of universal Turing machines. 
They represent varying levels of structural complexity, there being 
l(U) ≥ l' for all U. Thus we can think of an infinite variety of real material 
systems, differing by structure and by structural complexity, but all 
representing optimal rational beings in the sense that they are all 
theoretically capable of any possible sequence of logical deduction. 

Thus we can express the consequences of the theorem (2) for real 
material systems approximately as follows: 


(2') Above a certain level of structural complexity the real material 
beings having the organization of a Turing machine may become 
optimal rational beings in the sense that they are theoretically 
capable of performing any possible sequence of recursive logical 
deduction. Thus there is, in this sense, no theoretical upper limit 
for the intellectual capacity of these beings. There is an infinite 
variety of possible optimal rational beings, differing by structure 
and by structural complexity, their actual intellectual performance 
being restricted only by the material conditions (finite life span, 
energy consumption, available tools for work, sufficient environ- 
mental stimuli, etc.) to which they are subjected. 




From the formulations (1'), on p. 185, and (2') given above we understand 
at once the great significance of Turing’s theory, originally 
formulated as a theory of computability, for subjective dialectics. 
Of course one has to keep in mind that this theory is not intended to 
be any realistic description of what actually happens in the human 
brain. Turing’s theory explains the significance of the general 
organization of the cognitive system: to be a rational being of optimal 
organization the nervous system needs the peculiar mechanism for 
controlling its environment by means of a self-inspecting and movable 
receptor-effector system (a mere finite automaton without the tape 
and the feedbacks of the types I and II on p. 178 has much more restricted 
ability, as was shown by S. C. Kleene). Turing’s theory also tells us that 
this organization, above a certain level of structural complexity, is 
able to produce the optimal rational being in the sense that it is 
theoretically capable of all recursive logical reasoning, which is quite 
a lot. We can well understand that two important developments in 
cybernetic theory were directly inspired by Turing’s theory. 

First, Turing’s theory immediately raises the question, ’what kind of 
material organization corresponds to the finite automaton of a Turing 
machine?’ Can the automaton of a Turing machine be represented for 
instance by a material organization like that of the nervous system, 
i.e. by a kind of network of neurons? If so, we could make our theoretical 
understanding of the rational aspect of man somewhat more precise. 
This line of thought was followed up successfully by McCulloch and 
Pitts in their construction of neural networks, in terms of which every 
finite automaton can be represented.²⁶ 

Secondly, one is led to ask: can we derive from Turing’s theory a 
principle of reproduction of rational beings, similar to the actual 
reproduction of living beings? This line of thought was followed up by 
J. von Neumann in his theory of self-reproducing automata.²⁷ 

Let us only mention that both of the questions posed could be answered 
in the affirmative. Von Neumann was able to show that in a space 
(actually a plane) composed of adjacent cells, capable of assuming 29 
different states, localized Turing machines (even universal machines) 
could be constructed, these machines being able to construct other 
machines of similar kind, and even machines ’more highly developed’ 
than they themselves are. Thus the introduction of spatial localization 
widens the scope of Turing’s theory in a decisive manner, and enables 
one to consider the logical conditions of reproduction of rational beings 
in space and time. 


26. W. McCulloch and W. Pitts, A logical calculus of the ideas immanent in nervous 
activity. Bull. Math. Biophysics 5, pp. 115-133, 1943. 

27. J. von Neumann, Theory of Self-Reproducing Automata (edited by Arthur Burks), 
The University of Illinois Press, 1966. 


3 / Does the Recursivity of Its Operations Make the Turing Machine 
Intellectually Inferior to the Human Brain? 


We have emphasized that existing cybernetic theory, both in the 
domain of objective social dialectics and in subjective dialectics, has 
important limitations. The most important one is the lack of a cybernetic 
theory of the transformation of quantities into qualities in the 
course of a self-generating dialectical process. This fact has consequences 
in the application of existing cybernetic theory to both objective and 
subjective dialectics. In subjective dialectics it appears in the lack of 
a cybernetic theory dealing adequately with dialectical logic, i.e. with 
the problems related to the developmental processes of human 
consciousness. 

However, nobody has proven that this would be a limitation in principle, 
valid for ever and for all future forms of cybernetic theory. The field is 
open so far for hopes that future theory will bring a solution to this 
problem. So, for instance, the famous mathematician and prominent 
cyberneticist A.N. Kolmogorov once said, ”I belong to those extremist 
cyberneticists who see no fundamental limitations to the problem of 
life in the cybernetic approach and who believe it is possible to analyse 
life in all its aspects, including human consciousness, by cybernetic 
methods”.²⁸ 

An opposite opinion was expressed by the Finnish philosopher J. 
Hintikka, as a criticism of the Finnish predecessor of the present book. 
Hintikka sees in the recursivity of the operations of the Turing machine 
a limitation in principle of cybernetics to cope with all the aspects of 
life and human behaviour: ”I have stated that the arguments given 
by A. in favour of the special position of cybernetics as a 
general method [in the mathematical methodology] of behavioral 
science in principle are erroneous, since they do not see the 
limitations of recursive methods”.²⁹ 


28. A.N. Kolmogorov, Automatic machines and the life process, The Soviet Review, 
July 1962, p. 41. 

29. J. Hintikka, an official statement to the faculty of political sciences, University 
of Helsinki, 1970. Cf. with J. Hintikka, Sosiologia 4/1970 (in Finnish). 




As the recursivity of the operations of the Turing machine is an 
important principle in the mathematical representation of conscious 
human activity, we have every reason to discuss Hintikka’s argument in 
some detail. Is the recursivity indeed a limitation in principle, which 
restricts the possibilities of cybernetics in the representation of human 
activity, as suggested by Hintikka? 

It is well known that in mathematics there are non-computable 
numbers and non-recursive functions which cannot be computed by 
means of any existing algorithm, and thus in this sense are outside the 
scope of recursive methods. In defence of his methodological position, 
Hintikka takes recourse to this trivial fact (and to the incompleteness of 
recursive calculi, cf. p. 17). However, it is well known that the mentioned 
fact is not a limitation of the Turing machine in comparison with 
the human being: it is the very idea of Turing’s thesis that the intellectual 
capacity of human beings has just the same limitation to recursivity 
as the Turing machine. 

Hintikka is by no means alone in giving this oversimplified
philosophical interpretation of the well-known results concerning
recursivity and incompleteness. That interpretation has haunted the
discussion ever since the results in question were obtained
by Gödel and Church. Originally, it was expressed by E. Nagel
and J. R. Newman in their book entitled ’Gödel’s Proof’. Recently,
even the Russian mathematician A. A. Markov has repeated the
same interpretation at the end of his book ’The Theory of Algorithms’.

The philosophical situation, as I see it, has been correctly
represented, for instance, by Georg Klaus. He expresses the same idea
by referring to the Church theorem of non-decidability: “Our
epistemologically important result is thus: There are, as expressed by the
Church theorem of non-decidability, sets of problems for which
no algorithm exists in terms of which all the problems of the set
could be solved; accordingly, sets of problems which cannot be
generally solved by any electronic computer, however complicated.
However, this is a very weak consolation for anyone who would like to see a
difference of ability in principle between the human brain and the
computer, as the limits of this theorem of non-decidability concern
the brain as well as the computer”.³⁰

Klaus then refers to the obvious fact that it is man who discovers 


30. G. Klaus, Moderne Logik, Berlin 1972, pp. 329—330. 




the algorithms fed to the existing computers, which thus gives man
a position over the machine. However, he adds: “But now the arguments
here presented in favour of the superiority of the human brain are not
of a principled kind. They are associated only with the machines known
today. We know that modern cybernetics has developed conceptions
which make the phenomenon of adaptation and of the improvement
of adaptation understandable, i.e. the progressive development of living
beings. Starting from the fact that the human brain itself is one form,
and indeed the highest form, of adaptation to reality, there are no
limits in principle to the construction of machines which can
not only technically realize the algorithms but even find them”.³¹

In another book, Klaus returns to the problem of creativity,
distinguishing between schematic and creative thinking, whereby the
former is idealized by the algorithms realizable in a Turing machine
and the latter by the probabilistic way of functioning of “statistical
machines”.³²

However, here Klaus is obviously thinking of the ’Turing machine’
as a machine whose initial tape information (the signs on the tape
before the functioning of the machine) is given once and for all, and would
classify a Turing machine with probabilistically varying tape information
among the “statistical machines”. To avoid misunderstanding one must
emphasize that the active agent is in both cases the same, viz. the finite
automaton of the Turing machine. Only the initial conditions and the
environment represented by the tape are different in the two cases:
in the case of “schematic thinking” the Turing machine performs a
single computation programmed for it, while in the case of “creative
thinking” the same machine is located in a probabilistic environment
(cf. the ideas of von Neumann, p. 172) and produces varying
performances, reprogramming itself according to the varying ’needs’ set
by the environment.

It is important to realize that³³

no one has ever presented any evidence to the effect that the re- 
cursivity of the performance of the Turing machine would make 
this idealized machine, as a model of the formal-logical aspects 


31. G. Klaus, ibid., p. 330. 

32. G. Klaus, Kybernetik und Erkenntnistheorie, Berlin 1972, p. 258. 

33. This fact has been emphasized more than once elsewhere in the cybernetic
literature. I completely agree on this issue, for instance, with M. A. Arbib, who
in his book Brains, Machines, and Mathematics (McGraw Hill 1964) criticizes
the well-known arguments of Nagel and Newman.




of rational mental activity, in principle inferior to the performance
of the human brain, in either schematic or ’creative’ kinds
of action.

Hereby it is implied 

1° that the external information fed into the machine through the
tape may be either given once and for all (the ’schematic’ performance)
or statistically variable (the ’statistical machine’), and

2° that ’in principle’ in the above sentence means neglecting the
possible limitations of time, energy and material that may exist in
reality.

The real difference between the machine and the brain is not to be
sought in recursivity or nonrecursivity. The real difference is, firstly,
in the fact that the brain has no unlimited life-span, and thus no infinite
tape: no real material system has. The brain has to compensate for
this deficiency, as well as possible, by the complexity of the structure
of its automaton A, which makes the brain (we may well use this
term here) qualitatively different from the idealized machine. Secondly,
there are all the other differences due to the fact that every real material
being is subject to material conditions, due to the consumption of
energy, which are neglected in purely cybernetic theory. Thirdly, Turing’s
theory as yet lacks a theory of self-steering. This means that the
active, purposive aspect of the brain is neglected; this, of course,
is an omission of the theory only, and can be remedied in future.




CHAPTER V: 


Cybernetic Logic of 
Social Development 


Modern symbolic logic has achieved much in the logical study of existing
mathematical structures. This achievement was possible not least because of
the solid semantic foundation offered by mathematics itself for
formal-logical study. As a consequence of this semantic foundation, problems
of consistency, completeness (of axioms) and decidability (of truth)
in mathematical theories could often be discussed successfully. This
is mathematical logic, understood as the logic of mathematics.

In recent years an extension of symbolic logic has been getting more and
more attention, viz. modal logic with all its branches (including, among
others, deontic logic). Modal logic includes the formal-logical study
of notions like necessity, possibility, ’Is’ and ’Ought’, which are closely
connected with the social reality of human beings. Therefore the semantic
foundation of modal logic can hardly be found elsewhere than in a correct
understanding of the factors moulding our social reality. In short:
modal logic is directly concerned with social science.

It seems to me that modal logic has been developed on the basis of
linguistic idealism, which has neglected the very fundamental factors
of social reality such as, for instance, the appearance of goal-directed,
dialectical social development. As a consequence of this neglect the
existing modal logic has no solid semantic foundation. Nevertheless,
modal logic as it stands is being presented as the true logical
foundation of social science, thus giving positivistic conceptions of social
science a supposed consecration in terms of ’modern logic’.

In the present chapter I shall present my criticism of the semantic 
foundation of existing modal logic, and study the semantic foundation 
of modal logic on the basis of developmental theory. 

We can begin with a review of cybernetic notions related to objective
dialectics, as represented by the GDR collective of philosophers in 
the book entitled Marxistische Philosophie (Dietz Verlag, Berlin 1967). 
This book will be referred to as MPhi in the following. I think that 




their definitions related to the system-character of the world give 
a good verbal introduction even for a cybernetic approach to modal 
logic. 


1 § The System-Character of
the World 


1 / Universal Causal Determinism 


To make the argument explicit, it will be arranged in a tree of successive
decisions, each of which is a philosophically significant choice between
two alternatives:


Start
  Choice 1:  real systems                  |  language
  Choice 2:  determinism                   |  indeterminism
  Choice 3:  nets of causal interaction    |  linear causal chains
  Choice 4:  purposive systems             |  non-purposive nets of interaction
  Choice 5:  dialectical development       |  degenerative or halted
             toward higher qualities in    |  development
             islands of relative autonomy  |

Fig. 29. Tree of successive choices. At each step the argument follows the left-hand alternative.


Choice 1 is one between materialism and idealism. Instead of beginning
our logical study by listening to how words are used in the language
(which is the customary starting point of a Western scholar of modal logic),
we start from the logic of real material systems. What does this mean?

“The system character of the objective reality means that the world,
(i.e.) the nature and the society, is a whole of material systems
characterized by definite structural forms and being in definite relations to
one another” (MPhi, p. 264). By virtue of this definition we can conclude
that the world is thought of as a class $K = \{S_\alpha;\ \alpha \in X\}$ of material systems
$S_\alpha$, each of which is characterized by a structural form




(1)    $\mathrm{Str}(S_\alpha)$,

there being definite relations

(2)    $\mathrm{Rel}(S_\alpha, S_\beta)$,  $\mathrm{Rel}(S_\alpha, S_\beta, S_\gamma)$,  etc.

between the systems $S_\alpha \in K$.

“The material systems can be divided into static and dynamic ones.
However, this division is ultimately only a division originating from
practical ends. There are no static systems in the strict sense of the
word; they are just borderline cases of dynamic systems. This statement
is nothing but the system-theoretical formulation of the general thesis
of dialectical materialism that, in the last analysis, all rest states of
systems, all unchanging systems, are such only apparently or temporarily
and only approximately” (MPhi, p. 219). Let the systems $S_\alpha \in K$ thus
be dynamic.

“Elements of material dynamic systems are usually called active
elements. They are characterized by the fact that they receive and exert
influences... In the jargon of cybernetics the influences received by
the element are called its input, the influences it exerts on the others
its output. The way in which the active elements of a material
system are coupled with one another and with the environment of the
system forms the structure of the system” (MPhi, pp. 219–220). Thus
the material dynamic systems here meant correspond, in our terminology,
to the cybernetic wholes which simultaneously are cybernetic systems
(see §3, Chapter II). Let $E^\alpha_1, \ldots, E^\alpha_N$ be the active elements of the
system (or the whole) $S_\alpha$. We can associate with $S_\alpha$ the set $J(S_\alpha)$ of its
elements, and write:


(3)    $J(S_\alpha) = \{E^\alpha_1, \ldots, E^\alpha_N\}$,
(4)    $\mathrm{Str}(S_\alpha) = \mathrm{Coupl}(E^\alpha_1, \ldots, E^\alpha_N, \bar{S}_\alpha)$.


Here $\bar{S}_\alpha$ means the environment of the system $S_\alpha$ (i.e. $\bar{S}_\alpha$ comprehends
all the material systems outside of $S_\alpha$).

Accordingly, we specify the system $S_\alpha$ by the set of its active elements,
and define the structural form $\mathrm{Str}(S_\alpha)$ as the mutual coupling of the
elements $E^\alpha_1, \ldots, E^\alpha_N$ and the system $\bar{S}_\alpha$. By ’coupling’ we understand
here, generally, a statement telling us how the outputs of the elements
$E^\alpha_1, \ldots, E^\alpha_N$ and of the system $\bar{S}_\alpha$ influence the inputs of these elements
and the system $\bar{S}_\alpha$ (here, of course, by the output of $\bar{S}_\alpha$ we mean the
input of $S_\alpha$, and vice versa).




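After the above specification of the systems $S_\alpha \in K$ and their structural
forms, the relations (2) can be specified to mean the following, using
the self-explanatory notation $S_\alpha \cup S_\beta$:


(5)    $\mathrm{Rel}(S_\alpha, S_\beta) = \mathrm{Str}(S_\alpha \cup S_\beta)$,
       $\mathrm{Rel}(S_\alpha, S_\beta, S_\gamma) = \mathrm{Str}(S_\alpha \cup S_\beta \cup S_\gamma)$,  etc.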


Choice 2 introduces causality as a universal principle governing the whole
material world:

“The conception that all phenomena of the material world, on the
basis of objectively operating and recognizable laws, are in a general
causal interrelation with one another and are conditioned by each other
we call determinism. Correspondingly, we understand by ’indeterminism’
a conception which denies the objective existence of causal
interrelations as well as of general and specific lawfulnesses within the
distinct forms of motion of the matter” (MPhi, p. 264).

We introduce determinism into our description of the world by requiring
that, for each element $E^\alpha_i$ of each material system $S_\alpha$, a causal
relation


(6)    $R(E^\alpha_i) \subset X^\alpha_i \times Y^\alpha_i$


is specified, telling how the input of $E^\alpha_i$ determines the output of $E^\alpha_i$.
Thus $R(E^\alpha_i)$ is a two-member relation defined as a subset of the product
of the set $X^\alpha_i$ of all input states of $E^\alpha_i$ and of the set $Y^\alpha_i$ of all its output
states. The distinction between cause and effect, i.e. between the sets
$X^\alpha_i$ and $Y^\alpha_i$, is not trivial. As we have emphasized before (cf. pp. 94–96
and 99–103), it is not to be made on merely formal or conventional
grounds, for instance merely on the basis of the temporal order of
events. Causality is a primary category in Marxist philosophy, a
category preceding that of time (i.e. time is explained in terms of
causality and not vice versa). The distinction between cause and effect must
be made as a result of experience directly based on that part of social
practice which is relevant in each case. A definition given by Engels
is as follows:

“We find not only that a certain motion follows another, but we also
find that we can call forth a certain motion by producing the
conditions under which it occurs in nature; indeed, that we can call forth
motions which do not occur in nature at all (industry), at least not in
this form, and that we can give these motions a predetermined course
and range. Hereby, through the activity of man, the notion of causality
is grounded, the conception that a motion is the cause of another. The




activity of man gives the test of causality” (F. Engels, Dialektik der 
Natur). 

Once causality is, in accordance with formula (6), specified for the
elements $E^\alpha_i$ of the system $S_\alpha$, the coupling (4) of the elements defines
the system $S_\alpha$ as a causal net.

There are two kinds of causal nets, viz. those involving causal
interaction and those involving only linear causal chains. Let us write


(7)    $E^\alpha_i \to E^\alpha_j$

if the output of $E^\alpha_i$ is able to influence the input of $E^\alpha_j$. Then, if the
elements $E^\alpha_1, \ldots, E^\alpha_N$ can be so ordered that

(8)    $E^\alpha_i \to E^\alpha_j$  if, and only if,  $i < j$,


the system $S_\alpha$ composed of the elements $E^\alpha_1, \ldots, E^\alpha_N$ is said to contain
only linear causal chains (obviously, such systems are capable of only
mechanical reactions in the sense of our discussion in Chapter III,
cf. pp. 156–157). If (8) is not true, there is at least one circuit of
feedback in the system which, accordingly, represents causal interaction
(cf. p. 157). This divides the class $K$ of all material systems into two
mutually disjoint classes $K_{\mathrm{lin}}$ and $K'$:

(9)    $K = K_{\mathrm{lin}} \cup K'$,  $K_{\mathrm{lin}} \cap K' = \emptyset$.
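
As an illustration only, the distinction between systems containing only linear causal chains and systems with feedback can be sketched in Python. The sketch assumes that the relation (7) is given as a list of index pairs (i, j), and it reads condition (8) in the weaker sense that the elements admit an ordering in which influence runs only from lower to higher indices, i.e. that the influence graph contains no circuit. The function name has_feedback and the example data are hypothetical and do not come from the text.

```python
from collections import defaultdict, deque


def has_feedback(n_elements, influences):
    """Return True if the influence graph contains at least one circuit.

    influences: iterable of (i, j) pairs meaning "the output of element i
    is able to influence the input of element j" (relation (7)).
    """
    successors = defaultdict(list)
    indegree = [0] * n_elements
    for i, j in influences:
        successors[i].append(j)
        indegree[j] += 1

    # Kahn's algorithm: repeatedly remove elements that nothing influences.
    queue = deque(k for k in range(n_elements) if indegree[k] == 0)
    ordered = 0
    while queue:
        k = queue.popleft()
        ordered += 1
        for nxt in successors[k]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # If some elements could not be ordered, the graph has a feedback circuit.
    return ordered < n_elements


# Hypothetical three-element systems.
chain = [(0, 1), (1, 2)]         # only a linear causal chain
loop = [(0, 1), (1, 2), (2, 0)]  # the last element feeds back into the first

print(has_feedback(3, chain))
print(has_feedback(3, loop))
```

Under these assumptions the first system would be assigned to the class of linear causal chains and the second to the class of feedback systems.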

Choice 3 means focusing our interest on the class $K'$ of feedback
systems. This choice too has philosophical significance. The
philosophical significance of the distinction between the classes $K_{\mathrm{lin}}$ and
$K'$ was explained in MPhi as follows:

“Here we distinguish next those systems, whose elements are mutually
so coupled that they form linear causal chains, from the systems whose 
elements are bound together by means of feedback. The structure of 
the first kind of systems is such that the system reacts to the influences 
of the environment in a linear-causal way. Such systems we find mostly 
in the inorganic nature. As its prototype can serve a mechanical system 
of mass points under the influence of external forces ... Quite another 
type of systems are obtained, if the structure of the system shows feed- 
backs between the elements. According to the kind of feedback such 
a system is able to treat the impulses it receives from the environment 
either so that (1) certain parameters of the system remain constant or 
follow in their change a definite function or so that (2) the influence 
of the environment upon the system ever more increases and at last 
enforces the system out of its domain of stability ... The systems with 
compensative feedback (the case (1)) are to a certain extent able to




adapt themselves to definite types of disturbances, and to mould them
so that the system does not exceed the limits of stability. This type of
system appears in particular in organic nature and in society.
The knowledge of this type of feedback makes it possible to explain in
a natural way the goal-directed behaviour of many such systems, which
so far has always been a domain of idealistic teleology” (MPhi, p. 227).

Accordingly, choice 3 means opening the way for a causal explanation
of the active, purposive behaviour of living beings, and of human beings
and human society in particular. This way was closed to
mechanistic materialism, which concentrated on the linear causal chains
and interpreted them as the fundamental manifestation of causality in
nature.

Choice 4 leads our interest to the sub-class $K_{\mathrm{erg}}$ of goal-directed
(purposive) systems among all the feedback systems $K'$ of the
world:


(10)    $K_{\mathrm{erg}} \subset K'$,  $K_{\mathrm{erg}} \neq \emptyset$.


Accepting a non-empty class $K_{\mathrm{erg}}$ means crossing the barrier that
current bourgeois philosophy has usually erected between
the possibilities of causal explanation and the existence of purposive
beings (we thus oppose the views represented, for instance, by G. H.
von Wright, Explanation and Understanding, London 1971).

In fact, as we have seen in Chapter III, cybernetics already knows
how to build goal-directed systems, displaying a well-defined sense of
’purposiveness’ on a strictly causal ground. The last quotation referred
to the existence of just such systems. Such a system $S_\alpha \in K_{\mathrm{erg}}$
we have called a self-steering (ergodic) system. In such a system the
process of interaction between the elements is, within the limits of a
certain domain of stability (or rather ergodicity), independent of the
external disturbances of the system. Thus, when disturbed, such a system
is able to eliminate the disturbance and so to impose its ’own will’,
provided that the disturbance does not exceed the limits of the mentioned
domain. We shall apply a simple model of causally determined
purposiveness in §§ 2–3 below.

As to the terminology, we shall employ our previous terminology,
which is the same as that used by O. Lange. If the process of interaction
in our system $S_\alpha \in K_{\mathrm{erg}}$ asymptotically approaches a definite function of
time, instead of a constant, we shall speak of ergodicity rather than
stability. A stable causal net obviously is a special case of the more
interesting ergodic causal nets.




The larger the domain of ergodicity, the more independent the
system is of external disturbances or, expressed in still other words, the
greater is the relative autonomy of the system with respect to the
environment. The magnitude of relative autonomy is determined by the
structural form $\mathrm{Str}(S_\alpha)$ of the system $S_\alpha \in K_{\mathrm{erg}}$ in question, i.e. by the
coupling of the elements. To be more specific, it is determined by the
kind and complexity of the compensative feedbacks existing in the
coupling net.


2 / Development in Multi-Ergodic Systems 


Choice 5, finally, brings into our world the notion of development.
“The material world, as it presents itself to us today in the most different
domains, is a system of distinguishable stages of development of
matter. To understand and to change the world we thus have to
comprehend it not only in its lawful order but also in its change and
development” (MPhi, p. 357).

Choice 5 means assuming the existence of a non-empty class 


(11)    $K_{\mathrm{merg}} \subset K_{\mathrm{erg}}$,  $K_{\mathrm{merg}} \neq \emptyset$


of systems displaying great relative autonomy with respect to their
environments, and characterized by dialectical development toward
ever higher ontological qualities. What are the ’qualities’, what does
’higher’ mean, and what is ’dialectical development’?

As to the qualities in the ontological sense here used: “The material
world is a system of qualitatively different stages of development,
which in a rough classification can be indicated as the inorganic, organic
and the social stage of development” (MPhi, p. 364). In a more specific
classification of qualities the quality (this notion of ontological quality
should not be confused with the notions of input and output qualities
studied in Chapter II) appears as a predicate of every being and system:
“The manifesting essence of beings, systems etc. we call their ’quality’.
Otherwise expressed: the quality is the essential characteristic” (MPhi,
p. 375).

On the other hand: ”Every quality is quantitatively determined. 
A change of the quantitative determination of a being or of a system 
leaves, within certain limits, the quality of the being or of the system 
untouched. The measure (das Mass) gives the limit until which a given 
quality can change, without ceasing to be just this quality. By way 
of the notion of measure dialectical materialism thus orientates itself 




to the fact that every quality is quantitatively determined and the quality
and the quantity form a unity” (MPhi, p. 379). More precisely:
“Cybernetics deals, among other things, with the qualities of beings, in so far as they
are measures. This makes it possible to give a concrete and practical
content to the general statement of Marxist philosophy, that quality
and quantity form a unity within a given measure, as it has introduced
a computable equivalent of measure, viz. the notion of stability or the
domain of stability” (ibid., p. 381). “The fact that quantitative changes
within a definite quality lead, when the measure is exceeded, to the
transformation of this quality to another one, we call ’the law of the
transformation of quantitative changes to qualitative ones’” (ibid., p. 382).

We conclude: the ’quality’ of a system means the essential
characteristic of its structure and functioning, a measure of the quality being
the domain of ergodicity. In particular, the quality of an ergodic system
$S_\alpha \in K_{\mathrm{erg}}$ is changed if, and only if, the system is brought out of its domain
of ergodicity. This corresponds to the transformation of quantitative
changes to qualitative ones. Quantitative changes within the domain of
ergodicity leave the quality of the system untouched.³⁴

As to the general notion of dialectical development, we can here
make our earlier introduction (pp. 141–143) more precise by quotations from
MPhi: “The transformation of a quality to another we call a ’dialectical
jump’... the dialectical jumps can be divided, according to the kind
of dependence of the qualitative change on the preceding quantitative
changes, into those determined linear-causally and those determined
non-linear-causally” (MPhi, p. 406). “The more complicated and complex
an object is, the more importance has its self-dynamics in comparison
with the linear-causal dependence on the environment” (ibid., p. 407).
“The non-linear-causally determined transformations of quality are no
longer mechanically but in the dialectical sense determined” (ibid., p. 410).

An important distinction between different kinds of dialectical jumps 
is introduced as follows: ”Through a dialectical jump an object can 
be destroyed, a material system ruined, made unable to work. In this 
case we speak of a system-destroying transformation of quality. However, 


34. Existing cybernetic theory has no detailed model to explain how the
construction of the new quality, i.e. the construction of the new coupling scheme and
the new mode of action, is established after the breakdown of the old system.
This is meant when saying that there is as yet no mathematical model for the
transformation of quantities into qualities. However, the criterion of the
breakdown of the old system, viz. the exceeding of the limits of ergodicity, can be given
as is done here.




a dialectical jump can also be of the kind that the basic quality, the 
essence of the object, the system etc. is not touched, or even the stability 
or the maintenance of the basic quality is advanced. Such dialectical 
jumps we call system-conserving jumps” (MPhi, p. 412). 

More precisely: “The state of affairs underlying the system-conserving
dialectical jumps can be explained rather clearly by way of the example of
multistable cybernetic systems. Such a system is able to react to external
disturbances, not as a whole but by letting one or some of its subsystems
move to a qualitatively new behaviour. ... Such a multistable system
is for instance the human brain, which does not react to external
stimulation as a whole but mobilizes, according to the kind of the offered
task, the corresponding centers in the brain. The qualitative changes
occurring in the partial systems of a multistable system are
system-conserving jumps, as their end is to preserve the total system and its
basic quality” (MPhi, pp. 412–413).

Dialectical development, finally, is defined in terms of the system- 
conserving dialectical jumps: "Without any limitation, we can thus say 
that the development finally can advance only through such jumps of 
quality, which have the system-conserving character” (MPhi, p.413). 
“By dialectical negation we understand a system-conserving transfor- 
mation of quality” (ibid., p. 416). 

The newly introduced notion of dialectical negation leads us to the
meaning of the development towards ’higher qualities’: “By a positive
difference of development we understand the result of dialectical
negations which lead to the growth of the stability and the autonomy of
the system; correspondingly, by a negative difference of development
we understand the result of dialectical negations which lead to the
decrease of the stability and the autonomy of the system. If the difference
of development is positive, we speak of a development forwards
(Höherentwicklung) or of a progressive development; if the difference of
development is negative, we speak of a development backwards
(Rückentwicklung) or also of regressive development” (MPhi, pp.
424–425).

We conclude: dialectical materialism asserts the existence of
’islands of development’ in the world, i.e. of systems $S_\alpha \in K_{\mathrm{merg}}$ of
the multistable, or rather multi-ergodic, type. Such systems are
characterised by development, through successive system-conserving dialectical
jumps, towards ever greater ergodicity and autonomy with respect to
their environments. Each of those jumps occurs in a certain part of
the total system $S_\alpha$, and is system-destroying from the point of view



of the subsystem: it occurs as a consequence of the accumulation of
disturbances in that subsystem, whereby the limits of the domain of
ergodicity of the subsystem are exceeded so that a qualitative change
(of structure and function) is called forth in the subsystem. The
development occurs on a strictly causal basis, and is due solely to the structure
$\mathrm{Str}(S_\alpha)$ of the multi-ergodic system in question.

An important system of the multi-ergodic type is the total social 
system of mankind: ”The proletarian revolution for instance is a system- 
destroying transformation of quality, which leads to the collapse of 
the capitalistic production relations. However, the capitalistic system 
is a historically determined subsystem of the total system of ’human
society’, which as a whole is only strengthened by the mentioned trans- 
formation of quality, and achieves a higher stage of development. The 
transformation of quality, which is system-destroying for the subsystem, 
thus is system-conserving for the total system” (MPhi, p. 413). 

Let it be noted that we have not excluded probability from our world. 
As probabilistic laws can be considered as laws produced by the co- 
operation of a large number of non-dominating causal factors, the above 
construction allows the appearance of probabilistic laws just as well 
as it is based on the universality of strict causal determinism. However, 
the notion of chance, such as it appears in Marxist philosophy, has 
a more profound content to be studied more closely in the following. 


2 § The Notions of Necessity,
Possibility, Chance and Freedom 


1 / A Tangential Model of Development


“The tendency that growing autonomy leads to a development of
systems independent of the quantitative and qualitative changes of the
environment, that the transformations of quality are no longer simply
linear-causally determined, enters a new stage with the birth of human
society. While plants and animals adapt themselves, by means
of complicated mechanisms of regulation, to the environment, their
autonomy is restricted in so far as they are not able to create their
conditions of life, their environment. Human society, on the other
hand, includes the environment itself in its system to an increasing
extent in the course of historical development ... Thereby are the causes



of development accumulated almost exhaustively into the system of
human society” (MPhi, p. 409).

In view of this tendency of development it is not too unrealistic to 
represent the total human world (see Introduction p. 8), by a closed 
system S, whose structure is reduced to the coupling of the active elements 
with one another: 


(12)    $\mathrm{Str}(S) = \mathrm{Coupl}(E_1, \ldots, E_N)$.
Let the time be indicated by successive integers,

(13)    $t = 0, 1, 2, \ldots$,


let the sets of the thinkable total input and output states of S at the
moment $t$ be


(14)    $X_t \subset \mathbb{R}^m$,  $Y_t \subset \mathbb{R}^n$,


respectively, and let the causality and the coupling structure in the
system S be defined by the respective functions


(15)    $T: X_t \to Y_{t+1}$  (causality),
(16)    $C: Y_t \to X_t$  (coupling).


The function $T$ thus describes how the input states of the elements
of S at the moment $t$ are transformed to the output states of the same
elements at the moment $t+1$. The coupling function $C$ indicates how
the output states of the elements at the moment $t$ influence the input
states of the other elements. We can interpret $C$ as representing the
social and other couplings established between the elements
(= households, or individuals, or institutions, or whatever subsystems) of S in
the course of the earlier history of the system S. In a similar way $T$
represents the mode of behaviour of the elements as it has been moulded
in the course of the earlier history.

The limitations (13) and (14) are hardly too restrictive for our purpose.
We digitalize the time and let the system have $m$ input and $n$ output
channels (distributed somehow over the $N$ elements, each element $E_i$
having at least one input and one output channel), each thinkable input
or output in each channel being represented by a real number.

The limitations (15) and (16) are essential, as they fix the structure
and the functioning of our system once and for all. Due to the function
character of $T$ and $C$, involved in the assumptions (15) and (16), the
model system S is not able to change its structure or its way of
functioning: it may be ergodic but not multi-ergodic. It follows that our
system S cannot improve its ergodicity, while the real system of
mankind can do that by qualitative changes in subsystems. S can model
the real system of mankind only temporarily, in a certain interval of
time between two successive qualitative changes of the latter. That is
why we call S a tangential model.

By virtue of (15) and (16) we get two equivalent descriptions of the
process of interaction occurring in S:


(17) $\varphi = C \circ T: X_t \to X_{t+1}$ (the input process),
(18) $\psi = T \circ C: Y_t \to Y_{t+1}$ (the output process).


Choosing the former and defining the compositions


(19) $\varphi^2 = \varphi \circ \varphi$, $\varphi^3 = \varphi \circ \varphi \circ \varphi$, etc.

we have

(20) $x(t+k) = \varphi^k(x(t))$.

Let there be

(21) $D_t \subset X_t$, $D_t \neq \emptyset$,

such that

(22) $x(t) \in D_t$ iff $\varphi^k(x(t)) \to g_t(t+k)$ when $k \to \infty$.


Here $g_t$ is a fixed function of time t+k independent of the initial value
x(t). Then $D_t$ is the domain of ergodicity of our system S at the moment t.
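The definitions (15)-(22) can be illustrated on a very small finite system. The following is only a sketch under stated assumptions: the four states, the maps T and C, and the choice of a goal that is a single fixed state (rather than a general function of time) are all hypothetical simplifications, not taken from the model in the text.

```python
# Toy illustration of (15)-(22); every concrete value below is hypothetical.
X = [0, 1, 2, 3]                      # thinkable input states
T = {0: "a", 1: "a", 2: "b", 3: "c"}  # causality (15): input state -> output state
C = {"a": 0, "b": 1, "c": 3}          # coupling (16): output state -> input state


def phi(x):
    """Input process (17): apply T, then C."""
    return C[T[x]]


def phi_k(x, k):
    """k-fold composition (19), giving x(t+k) from x(t) as in (20)."""
    for _ in range(k):
        x = phi(x)
    return x


def domain_of_ergodicity(goal):
    """States whose trajectory has reached the fixed point `goal`.

    Assumes phi(goal) == goal; in a finite system len(X) steps are
    enough for any trajectory that ever reaches `goal` to have done so.
    """
    return {x for x in X if phi_k(x, len(X)) == goal}


D = domain_of_ergodicity(goal=0)
print(sorted(D))  # prints [0, 1, 2]
```

State 3 is a separate fixed point, so it lies outside D: a disturbance that pushes the toy system into state 3 can never be compensated by the fixed functions T and C.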

A multi-ergodic system can be modelled by an ergodic system only 
within a certain interval of time. In our digital time, let it be the moment
t = z in the neighbourhood of which the processes of interaction in the
ergodic model system S and in the multi-ergodic real system of mankind
approach one another. Then we can say that the real system of mankind
is at the moment z developing towards the goal represented by the
direction or goal function $g_z$, the domain of ergodicity of the real system
too being $D_z$. Soon after the moment z the development of the real
system may depart from that of the model. The real system may develop, 
as a consequence of disturbances exceeding the domain of ergodicity 
of some subsystem, a qualitative change in some of its subsystems. 
This in turn may lead to an extension of the domain of ergodicity of 
the total system. Let us keep in mind this difference between our model 
S and the real system of mankind later, when applying the model. 




The objective laws of development of human society are represented
by the function $\varphi$ (or $\psi$) in our model. What does the existence of such
laws mean? It means that there are certain conditions, moulded in the 
course of the earlier history of humanity, and appearing at the moment 
z under discussion now as necessities which limit the possible ways 
of development of the social system — and the freedom of action of 
the individuals — in a definite way. Let us now study these conditions 
in more detail. 


2 / The Semantics of Logical Modalities 


Let us now consider the situation of our model at the moment z,
when the model is supposed to give a first approximation of the real
social system of mankind. Choosing the $\varphi$-representation of laws,
the set $X_z = X$ gives the set of all thinkable states of the system S (in
the following, when referring to the moment z, we shall leave off the
time index z). In Marxist terminology, every element $x \in X$ then represents
a thinkable state of motion of mankind, the inner law of motion of
mankind being represented by the function $\varphi$. For the sake of simplicity,
let us assume henceforth that the set X is finite (without this simplification
we should have to speak in the following of the elements of Borel
fields of subsets of X and of their $\sigma$-algebras instead of the subsets
of X).

Let us generate a language $L_0$ of sentences p in the following way.
Let $p_A$ be the sentence of $L_0$ claiming that the world is in a state
belonging to the subset A of all the thinkable states, $A \subset X$. The logical
operations of negation, conjunction, disjunction and material implication
are then defined by

(24) $\sim p_A = p_{X-A}$, $p_A \,\&\, p_B = p_{A \cap B}$, $p_A \vee p_B = p_{A \cup B}$,
$(p_A \supset p_B) = p_{(X-A) \cup B}$ for every $A, B, \ldots \subset X$.


Here $X-A$ means the complement of A in the set X.
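Definition (24) says that each connective is simply a set operation on subsets of X. A minimal sketch, assuming a hypothetical four-element set of thinkable states and representing each sentence by the subset it claims the world to be in:

```python
# Sentences of the language are represented by subsets of X, following (24).
X = frozenset({"x1", "x2", "x3", "x4"})  # hypothetical thinkable states


def neg(A):
    """Negation: the complement of A in X."""
    return X - A


def conj(A, B):
    """Conjunction: intersection."""
    return A & B


def disj(A, B):
    """Disjunction: union."""
    return A | B


def implies(A, B):
    """Material implication: (X - A) union B."""
    return (X - A) | B


A = frozenset({"x1", "x2"})
B = frozenset({"x2", "x3"})
print(sorted(implies(A, B)))  # prints ['x2', 'x3', 'x4']
```

With this representation a sentence implied by itself, `implies(A, A)`, is the whole set X, and `conj(A, neg(A))` is the empty set.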

Let the system of mankind, where x is the real state, be denoted
by $S_x$. For every element $x \in X$ there is thus a 'thinkable world' $S_x$
which can be discussed in the language $L_0$. These 'worlds' are the semantic
models of $L_0$. The total set of semantic models of $L_0$, or of the
'thinkable worlds', is given by


(25) $\{S_x;\ x \in X\} = SM(L_0)$.


We define the truth values W (true) and F (false) for the sentences
of $L_0$ as follows:

(26) $\mathrm{Wert}(p_A, S_x) = W$ iff $x \in A$,
(27) $\mathrm{Wert}(p_A, S_x) = F$ iff $x \in X-A$.


We define the general validity $ag_0(p)$ of a sentence p, the realizability
$ef_0(p)$ of p, and the semantic equivalence $p\ \mathrm{semeq}_0\ q$ of two sentences
p and q in $L_0$ as follows:


(28) $ag_0(p)$ iff $\mathrm{Wert}(p, S_x) = W$ for all $x \in X$,
(29) $ef_0(p)$ iff $\mathrm{Wert}(p, S_x) = W$ for at least one $x \in X$,
(30) $p\ \mathrm{semeq}_0\ q$ iff $\mathrm{Wert}(p, S_x) = \mathrm{Wert}(q, S_x)$ for all $x \in X$.


The symbols ag and ef here come from the German words 'allgemeingültig'
and 'erfüllbar' used by the GDR school of logicians.

It follows from the above definitions (24)-(30) that all the axioms
and theorems of ordinary propositional logic, that is, all the logical truths of
this logic, are generally valid in $L_0$, i.e. semantically equivalent to
the sentence $p_X$. Similarly, each proposition $p_A$ for a non-empty A is
realizable, and every sentence of $L_0$ is semantically equivalent to some
sentence $p_B$ with some $B \subset X$. Logical contradictions are sentences
which are semantically equivalent to $p_\emptyset$, where $\emptyset$ is the empty set.
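The truth-value definitions (26)-(30) can be checked mechanically on a small example. This is a sketch only: the four-element X and the function names `wert`, `ag0`, `ef0` and `semeq0` are illustrative choices, with sentences again represented by subsets of X.

```python
X = frozenset(range(4))  # hypothetical thinkable states


def wert(A, x):
    """Truth value (26)-(27) of the sentence p_A in the thinkable world S_x."""
    return "W" if x in A else "F"


def ag0(A):
    """General validity (28): true in every thinkable world."""
    return all(wert(A, x) == "W" for x in X)


def ef0(A):
    """Realizability (29): true in at least one thinkable world."""
    return any(wert(A, x) == "W" for x in X)


def semeq0(A, B):
    """Semantic equivalence (30): same truth value in every thinkable world."""
    return all(wert(A, x) == wert(B, x) for x in X)


A = frozenset({0, 1})
tautology = A | (X - A)      # p_A or not p_A
contradiction = A & (X - A)  # p_A and not p_A

print(ag0(tautology), ef0(contradiction), semeq0(tautology, X))
# prints True False True
```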

So we have shown above how to describe our tangential develop- 
mental model in terms of ordinary, classical logic. However, such a 
description completely misses the essential point of the model, viz. 
the goal-directedness of the process displayed by it. To do full justice 
to this basic property we must construct a non-classical logic. 

We must take into account that the laws of development restrict the
possible states of mankind to those belonging to the domain of ergodicity
D. This means that not all of the thinkable semantic worlds are
possible: possible are only those $S_x$ for which $x \in D$. Thus, instead
of $SM(L_0)$ we have to consider the following set of semantic models:


(31) $\{S_x;\ x \in D\} = SM(L)$.


The condition $x \in D$ appears as a necessity dictated by the objective
laws of development. Its character of necessity involves that human
beings have no absolute freedom to choose any thinkable way of development
$x \in X$, but are bound to follow one of the ways of development
inside the domain of ergodicity. Corresponding to this necessity
there is the impossibility of the inner states of motion $x \in X-D$, which are
incompatible with the laws of development.

To express the modal category of necessity we can construct the new
language L preserving the sentences, the axioms and the rules of inference
of $L_0$ but giving a new semantic interpretation to the sentences of $L_0$.
Instead of the two semantic values true and false we introduce four of
them: N (necessity), U (impossibility), M (possibility), and M' (contingens).

The statement that $x \in D$ is necessary is then expressed by


(32) $\mathrm{Wert}(p_A) = N$ iff $D \subset A$,

and the statement that $x \in X-D$ is impossible by

(33) $\mathrm{Wert}(p_A) = U$ iff $A \subset X-D$.

Possible are all the cases which are not impossible. Thus

(34) $\mathrm{Wert}(p_A) = M$ iff $A \cap D \neq \emptyset$.

The semantic value M' indicates the non-necessary cases:

(35) $\mathrm{Wert}(p_A) = M'$ iff not $D \subset A$.


These definitions of the four semantic values can be extended to any 
sentence of L by requiring further that 


(36) $\mathrm{Wert}(p) = \mathrm{Wert}(q)$ if p syneq q.


Here p syneq q means that $p \supset q$ and $q \supset p$ hold according to the common
syntax of the languages $L_0$ (which was constructed earlier) and L (to
which it was extended by definition of L).
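Since a sentence may carry more than one of the four semantic values at once (a necessary sentence is also possible), it is convenient to compute the whole set of values that (32)-(35) assign to it. A minimal sketch, assuming a hypothetical six-state X and a non-empty domain of ergodicity D:

```python
X = frozenset(range(6))     # hypothetical thinkable states
D = frozenset({0, 1, 2})    # hypothetical domain of ergodicity (non-empty)


def modal_values(A):
    """Return every semantic value that (32)-(35) assign to the sentence p_A."""
    values = set()
    if D <= A:           # (32) necessity: D is contained in A
        values.add("N")
    if A <= X - D:       # (33) impossibility: A lies wholly outside D
        values.add("U")
    if A & D:            # (34) possibility: A meets D
        values.add("M")
    if not D <= A:       # (35) contingens: D is not contained in A
        values.add("M'")
    return values


print(sorted(modal_values(frozenset({0, 1, 2, 5}))))  # prints ['M', 'N']
print(sorted(modal_values(frozenset({4}))))           # prints ["M'", 'U']
print(sorted(modal_values(frozenset({0, 4}))))        # prints ['M', "M'"]
```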

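If $\mathfrak{N}$, $\mathfrak{U}$, $\mathfrak{M}$, and $\mathfrak{M}'$ are the classes of necessary, impossible, possible
and contingent statements of L we then have:


(37) $\mathfrak{M} = \mathfrak{N} \cup (\mathfrak{M} \cap \mathfrak{M}')$, $\mathfrak{M}' = \mathfrak{U} \cup (\mathfrak{M} \cap \mathfrak{M}')$.


Here $\mathfrak{M} \cap \mathfrak{M}'$ represents the category which was by Aristotle called
'two-sided possibility', i.e. the category of statements which are both
possible and non-necessary. The total class of all the sentences of L
is obviously exhausted by the following union of mutually disjoint
classes:


(38) $F = \mathfrak{N} \cup \mathfrak{U} \cup (\mathfrak{M} \cap \mathfrak{M}')$.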


In words: every proposition expressible in L is either necessary, or 
impossible, or then it is both possible and non-necessary. 
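The decompositions (37) and (38) can be verified by brute force over all sentences of a small model. The sketch below assumes a hypothetical four-state X with a two-state domain of ergodicity D, and again identifies each sentence with a subset of X; it relies on D being non-empty.

```python
from itertools import combinations

X = frozenset(range(4))   # hypothetical thinkable states
D = frozenset({0, 1})     # hypothetical non-empty domain of ergodicity


def subsets(S):
    """All subsets of S as frozensets."""
    items = sorted(S)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


sentences = subsets(X)
necessary = {A for A in sentences if D <= A}
impossible = {A for A in sentences if A <= X - D}
possible = {A for A in sentences if A & D}
contingent = {A for A in sentences if not D <= A}
two_sided = possible & contingent

# (37): possible = necessary or two-sided; contingent = impossible or two-sided
holds_37 = (possible == (necessary | two_sided)
            and contingent == (impossible | two_sided))

# (38): the three classes are mutually disjoint and exhaust all sentences
holds_38 = ((necessary | impossible | two_sided) == sentences
            and not (necessary & impossible)
            and not (necessary & two_sided)
            and not (impossible & two_sided))

print(holds_37, holds_38)  # prints True True
```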




It is the category $\mathfrak{M} \cap \mathfrak{M}'$, comprising the sentences which are both
possible and non-necessary, that is close to the Marxist category of
possibility. The existence of this logical category in a logic of development
implies that, within the limits of the domain of ergodicity, the
real inner state of motion of mankind is not fully determined by historical
necessity. Within the limits of the domain of ergodicity there
is room for chance and for the active effort of man. In other words,
the domain of ergodicity indicates the limits within which the development
of mankind is determined not by historical necessity but
by chance and by active human pursuit.

Here chance is to be understood as objectively existing, i.e. as the
result of numerous non-dominating causal factors (cf. p. 206) giving
their contribution to the causal determination of the real state x. The 
active human pursuit is also to be understood as an objective causal 
factor contributing to the causal determination of the real state x. 

Subjectively, the fact that human activity is able to influence the course 
of human history within the limits of the domain of ergodicity may be 
expressed as the freedom of will. Since the dialectical development of
the real multi-ergodic system of mankind means growing ergodicity
(cf. p. 205), this definition is not in disagreement with the well-known
definition given by Engels, which relates the increase of freedom to
progressive development.


3 § The Materialistic Foundation 
of Modal Logic and Deontics 


1 / The Logical Truths and the Dialectical Truths 


The logical modalities of necessity, possibility, impossibility, and
contingens were defined as characteristics of the sentences of the language
L with respect to the whole set SM(L) of semantic models. Picking out
a particular 'possible world' $S_x$ from SM(L) we can define the 'ordinary'
truth values W (true) and F (false), and relate them to the semantic
values N, U, M, and M'.

For a given 'possible world' $S_x \in SM(L)$, $x \in D$, we can introduce the
definitions


(39) $\mathrm{Wert}(p_A, S_x) = W$ iff $x \in A$,
(40) $\mathrm{Wert}(p_A, S_x) = F$ iff $x \in X-A$.


The general validity ag, the realizability ef, and the semantic equivalence
are thus for the sentences of the language L defined by


(41) $ag(p)$ iff $\mathrm{Wert}(p, S_x) = W$ for all $x \in D$,
(42) $ef(p)$ iff $\mathrm{Wert}(p, S_x) = W$ for at least one $x \in D$,
(43) p semeq q iff $\mathrm{Wert}(p, S_x) = \mathrm{Wert}(q, S_x)$ for all $x \in D$,


respectively.

Comparing (41)-(43) with (28)-(30) we see that for every p satisfying
$ag_0(p)$ we have a fortiori ag(p). Thus every logical truth of the language
$L_0$ is a logical truth of the language L too. We also see that $p\ \mathrm{semeq}_0\ q$
implies p semeq q, so that every pair of semantically equivalent sentences
of $L_0$ is semantically equivalent in L too. However, $ef_0(p)$ does not
imply ef(p): realizability in the language $L_0$ is not sufficient for
realizability in L.
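The asymmetry between validity and realizability when passing from the thinkable worlds to the possible worlds can be seen on a small example. This is a sketch with hypothetical sets X and D; the function names simply mirror the symbols of (28)-(29) and (41)-(42), with sentences represented by subsets of X.

```python
X = frozenset(range(5))    # hypothetical thinkable states
D = frozenset({0, 1, 2})   # hypothetical domain of ergodicity


def ag0(A):
    """Valid over all thinkable worlds, as in (28)."""
    return X <= A


def ag(A):
    """Valid over the possible worlds only, as in (41)."""
    return D <= A


def ef0(A):
    """Realizable in some thinkable world, as in (29)."""
    return bool(A & X)


def ef(A):
    """Realizable in some possible world, as in (42)."""
    return bool(A & D)


outside = frozenset({3})   # a sentence true only in a world outside D
print(ef0(outside), ef(outside))   # prints True False
print(ag0(X), ag(X))               # prints True True
print(ag0(D), ag(D))               # prints False True
```

The last line shows that the converse of the first implication fails: the sentence claiming that the state lies in D is valid over the possible worlds without being valid over all thinkable ones.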

On the other hand, by comparing (39)-(40) with (32)-(35) we see
that the modalities N, U, M, and M' are connected with the truth
values W and F as follows:


(44) $\mathrm{Wert}(p) = N$ iff $\mathrm{Wert}(p, S_x) = W$ for all $x \in D$,
(45) $\mathrm{Wert}(p) = U$ iff $\mathrm{Wert}(p, S_x) = F$ for all $x \in D$,
(46) $\mathrm{Wert}(p) = M$ iff $\mathrm{Wert}(p, S_x) = W$ for at least one $x \in D$,
(47) $\mathrm{Wert}(p) = M'$ iff $\mathrm{Wert}(p, S_x) = F$ for at least one $x \in D$.


Calling the necessarily true propositions dialectical truths we have, by 
(41) and (44), the semantic equivalence of all the dialectical and logical 
truths: 


(48) if ag(p) and Wert(q) = N, then p semeq q.


Let us think about the result (48) a little. What does it mean? The 
dialectical truths, i.e. the necessarily true statements in L, express the 
requirement that the real world must at any moment obey the objective 
laws of development, i.e. that the real state of motion of mankind 
must be within the domain of ergodicity D. The truths of formal logic 
are of course not outside of this requirement, as even formal logic 
must be considered to reflect the material reality. Thus the domain of 
validity of logical truths cannot be larger than the domain of validity 
of dialectical truths. Our result only says that, in view of this limitation,
it is maximal: the logical truths have the maximal domain of validity,
and thus the same domain of validity as have the (other) dialectical 
truths. 


2 / Strict Implication and the Modal Operators 


So far we have formulated our developmental logic in terms of the 
language L, where the four fundamental modalities N, U, M, and M'
appear as the four semantic values of the sentences of L with respect 
to the whole set SM(L) of ’possible worlds’. Instead of this we can just 
as well represent the basic modalities in a language ML, where they 
appear as operators applying to the sentences of ML. A language in 
which necessity, possibility, impossibility, and contingens appear as 
operators is called modal logic. 

The construction of the developmental modal logic ML on the basis 
of the developmental logic L is obvious. We have only to move all the 
sentences, logical connectives and axioms from L to ML, and add the 
new sentences Np (for each necessarily true sentence p), Up (for each
necessarily false sentence p), Mp (for each possibly true sentence p)
and M'p (for each possibly false sentence p). The rules of operation
with the operators N, U, M, and M' can be derived from the corresponding
rules of the semantic values N, U, M, and M' in L.

Obviously, all the modal operators can be expressed in terms of 
one of them, say, N as follows: 


(49) $Up = N{\sim}p$, $Mp = {\sim}N{\sim}p$, $M'p = {\sim}Np$.


Thus the axioms and the rules of inference determining the operation 
with modal operators can be formulated as axioms and rules for N. 

In the formal, axiomatic construction of modal logic there has been 
an ambiguity, which in the early history of modal logic produced some 
confusion. We know that in the ordinary propositional calculus all the 
axioms can be given the form of generally valid implications, 


(50) $ag(p \supset q)$.


Can one take the axioms of propositional calculus such as they are,
i.e. as generally valid ordinary implications, or should one construct
separate axiomatics in terms of the N-implications


(51) $N(p \supset q)$?


In distinction from the ordinary 'material implication' the N-implications
were called 'strict'.




Historically, the first formulations of modal logic (by C. I. Lewis)
followed the latter line, while later on it was shown (by K. Gödel) that
the former method could be used as well.

We can see the reason for the ambiguity associated with strict implication:
all the generally valid material implications are in L, by virtue of
(48), semantically equivalent to the corresponding strict implications:


(52) $ag(p \supset q)$ semeq $N(p \supset q)$.


Accordingly, it is irrelevant whether the axioms of modal logic involving
implications are written in terms of material or strict implications.

It follows from the preceding that the axioms and the rules of inference
of ordinary propositional logic can indeed be moved as such from
L to ML. The additional axioms required to determine the operation
with modal operators can for instance be given the following form:

1° $Np \supset p$,

2° $Np \supset NNp$ (or, in view of 1°, $N^2 p = Np$),

3° $Np \supset (N(p \supset q) \supset Nq)$.

The first axiom expresses the fact that a statement of necessity is
logically stronger than a statement of truth, i.e. the fact that a necessary
truth is true in every possible world. The second axiom excludes, together
with the axiom 1°, the artificial modal categories produced by
the formal possibility of the powers of N: $N^k p$ (and it also eliminates
the mixed categories of higher order: NMNM'UMN...Np etc.). The
third axiom is, by trivial transformations, converted to the form
$Np \,\&\, Mq \supset M(p \,\&\, q)$, which is easily seen to be generally valid, in
view of the semantic models SM(L).
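The general validity of the axioms 1°-3° can be checked exhaustively in a small model. The sketch below rests on an assumption that is a reading of the construction rather than a formula given in the text: the sentence Np is represented by the whole set X when D is contained in A, and by the empty set otherwise, because its truth does not depend on the particular world. The sets X and D are hypothetical.

```python
from itertools import combinations

X = frozenset(range(4))   # hypothetical thinkable states
D = frozenset({0, 1})     # hypothetical non-empty domain of ergodicity
EMPTY = frozenset()


def subsets(S):
    """All subsets of S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def implies(A, B):
    """Material implication as a set: (X - A) union B."""
    return (X - A) | B


def nec(A):
    """Assumed representation of Np: true in all worlds or in none."""
    return X if D <= A else EMPTY


def valid(A):
    """General validity in L, as in (41): true in every possible world."""
    return D <= A


def axioms_hold():
    for A in subsets(X):
        if not valid(implies(nec(A), A)):              # axiom 1
            return False
        if not valid(implies(nec(A), nec(nec(A)))):    # axiom 2
            return False
        for B in subsets(X):
            inner = implies(nec(implies(A, B)), nec(B))
            if not valid(implies(nec(A), inner)):      # axiom 3
                return False
    return True


print(axioms_hold())  # prints True
```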


3 / The Materialistic and the Idealistic Approach in Modal Logic 


We have built all the languages $L_0$, L and ML above by way of semantic
construction. In other words, we started with the developmental model
of reality S, and derived from it the 'thinkable worlds' (for $L_0$) and
the 'possible worlds' (for L and ML) on the basis of which we constructed
the languages. In such a construction the syntax of the language
appears as a consequence, not as a starting point.

A semantic construction of this kind corresponds to the philosophical 
standpoint of materialism, according to which material reality is the 
primary existing thing, and the content of consciousness including 
language is produced by it by reflection. 




Historically, modal logic was developed by means of the opposite 
approach. Modal logic like so much in the development of modern 
formal logic has been historically closely connected with philosophical 
idealism. Indeed modern logic as it stands is still mainly a product of 
mathematicians and philosophers who stand close to neopositivism, 
or to some more fashionable forms of philosophical idealism. It 
is characteristic of their approach to logic that language is for them 
the primary thing: at the height of the neopositivistic stream in the 
1930’s one even attempted the construction of a purely formal universal 
language of science by agreeing on some axioms and rules of inference,
after which the whole of logic and mathematics was to be a formal play with
symbols.

Now, in the 1970’s, the idea of a purely syntactic construction of 
logic is definitely out. However, the general idea of starting with language 
is still there and dominates the method of Western modal logicians. 
This method could be called the method of almost syntactic approach. 
It consists of the attempts 

1) to build modal logic and deontics by experimenting with different 
kinds of axiom systems, and 

2) to ”verify” the truth of these axioms by references to the verbal 
habits existing in the natural language. 

The method of almost syntactic approach of logic is bound to lead 
to difficulties almost as surely as the neopositivistic method of purely 
syntactic construction. An example of this is the confusion connected
with the notion of strict implication (see p. 214) in the original construction
of modal logical systems by C. I. Lewis. Another example is
the confusion connected with the first axiomatic deontic logic by E. Mally.
The school of the 'almost syntactic approach' itself explains
these and other confusions met in modal and deontic logic by the
truism that errors have been made when applying the rules (1) or (2):
a lack of caution in the formulation of the axioms, or insufficient attention
to the nuances of natural language (see e.g. the articles of J. Hintikka
and G. von Wright in R. Hilpinen (Ed.), Deontic Logic, Dordrecht
1971).

From the materialistic point of view, the reason for these difficulties
is more profound. It is to be found in the lack of a solid semantic foundation
for modal or deontic logic when based on the nuances of natural
language. Instead of strained subjective introspection one should seek
the foundation of modal logic and deontics in the objective laws of 
development of mankind. 




4 / Is Many-Valued Modal Logic Needed?


Historically, the difficulties both of the first systems of modal logic
(by Lewis) and of deontic logic (by Mally) were originally interpreted
as suggesting the need for many-valued logic in modal logic (as suggested
by J. Łukasiewicz) and in deontic logic (as suggested by K.
Menger). Deontic logic will be discussed later on. We consider here
only the arguments given by Łukasiewicz for the need for many-valued
logic in modal logic.

Łukasiewicz, following the method of the almost syntactic approach,
wanted to build modal logic by listening to the nuances of natural
language. So he came to the conclusion that the statement 'if ~p,
then ~Mp' should be a theorem of modal logic (J. Łukasiewicz, Philosophische
Bemerkungen zu mehrwertigen Systemen des Aussagenkalküls,
Comptes Rendus, Soc. Sci. Lettres Varsovie XXIII, 1930, p.
51-77). He then showed that this together with the axiom $Np \supset p$
leads to logical contradiction in the two-valued logic. Another argument
used by Łukasiewicz was based on the belief that if we allow the two-sided
possibility, i.e. Mp & M~p, then we should admit that everything
is possible: Mq.

However, if we build modal logic on the basis of a model of development
we see at once the following. The statement '~p' refers to a
definite 'possible world' $S_x$, while the statement '~Mp' refers to the
whole class SM(L) of all 'possible worlds'. Thus 'if ~p, then ~Mp'
cannot be a theorem of modal logic. In a similar way, noticing that the
modal categories N, U, M, and M' refer to the total class of 'possible
worlds', we see that from $p \in \mathfrak{M} \cap \mathfrak{M}'$ it does not follow that every
$q \in \mathfrak{M}$.
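Both counter-arguments can be made concrete with a single small model. The sketch assumes hypothetical sets X and D and two hypothetical sentences p and q, represented by the sets of worlds in which they are true.

```python
X = frozenset(range(5))    # hypothetical thinkable states
D = frozenset({0, 1, 2})   # hypothetical domain of ergodicity


def true_in(A, x):
    """Ordinary truth of the sentence p_A in the single world S_x."""
    return x in A


def possible(A):
    """Modal value M: the sentence is true in at least one possible world."""
    return bool(A & D)


p = frozenset({0})   # true in the possible world S_0 only
q = frozenset({4})   # true only in a world outside D

not_p_holds_in_S1 = not true_in(p, 1)          # '~p' holds in the world S_1
p_is_possible = possible(p)                    # yet 'Mp' holds
p_two_sided = possible(p) and possible(X - p)  # Mp & M~p
q_is_possible = possible(q)                    # but 'Mq' fails

print(not_p_holds_in_S1, p_is_possible, p_two_sided, q_is_possible)
# prints True True True False
```

In the world S_1 the sentence '~p' is true although 'Mp' holds, so 'if ~p, then ~Mp' fails; and p is two-sidedly possible although q is not possible.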

Thus the arguments of Łukasiewicz in favour of many-valued modal
logic are not correct. Of course this does not mean denying the value 
of many-valued logic in other contexts. 


5 / Objective Ethical Value 


Modal logic, when based on the objective laws of development, neces- 
sarily implies a certain deontic logic. This is seen as follows. 
Let


(53) $\{a_\lambda(x);\ \lambda \in \Lambda,\ x \in X\} = a(X)$


be the set of thinkable acts by human beings in the situation existing
in mankind at that moment z to which the set of thinkable inner
states of mankind $X = X_z$ refers. The argument x of $a_\lambda(x)$ is the state,
$x \in X$, in which mankind is left by the act $a_\lambda(x)$. We let everything else
which characterizes the act $a_\lambda(x)$, like the actor, for instance, be
described by the index $\lambda$, as all this is irrelevant for us now. The set
of all the possible acts allowed by the laws of development is then
given by


(54) $\{a_\lambda(x);\ \lambda \in \Lambda,\ x \in D\} = a(D)$.


In words: possible are the acts resulting in a state of mankind which 
is within the domain of ergodicity of the system of mankind. 

We must admit, of course, that the result x of an act $a_\lambda(x)$ is not
determined by the conscious purpose of the actor only. It is determined
partly even by chance, understood as an objective causal factor (cf.
p. 212). However, in our present description of acts all this is included
in the description of $a_\lambda(x)$ given by the index $\lambda$. Noticing this we can
say that (54) expresses, in our present representation of acts, the limits
of the freedom of human action existing in this particular situation 
of mankind. 

Let $x'(t)$ and $x''(t)$ be two solutions of the inner law of motion,
$x(t+1) = \varphi(x(t))$ of S, and $\xi'(t)$ and $\xi''(t)$ two state-functions of the
real, multi-ergodic system of mankind, such that


(55) $\xi'(z) = x'(z)$, $\xi''(z) = x''(z)$, and
$\xi'(z+\theta)$ produces a larger domain of ergodicity than
$\xi''(z+\theta)$ does


in the real multi-ergodic system of mankind. Here $\theta$ is a fixed
magnitude of time. Let us rewrite (55) as a relation of weak ordering,


(56) $x'' \leq x'$.

We write further:

(57) $x'' = x'$ iff $x'' \leq x'$ and $x' \leq x''$,
(58) $x'' < x'$ iff $x'' \leq x'$ but not $x' \leq x''$.


The formula (58) reads: $x'$ is to be preferred to $x''$.

The relation of weak order defined by formulae (55)—(58) determines 
a division of the set of possible functions x(t) into a set of mutually disjoint
classes of equivalence. Let this division be non-trivial, i.e. let there be 
at least two classes of equivalence. 




Letting each function x be represented by its value at the moment
z, x = x(z), the formulae (55)-(58) define a preference ordering in the
set $D = D_z$ of the possible states of mankind at the moment z, and
thus a preference ordering in the set a(D) of possible acts. Writing


(59) $a_\lambda(x'') < a_\lambda(x')$ iff $x'' < x'$


we can say: the act $a_\lambda(x')$ is to be preferred to the act $a_\lambda(x'')$ if, and only
if the state of affairs $x'$ is to be preferred to the state of affairs $x''$. So
we have formulated an objective system of ethical values and norms.
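The ordering defined by (55)-(59) can be sketched in a few lines. The sketch makes a strong simplifying assumption that is not in the text: the "larger domain of ergodicity" reached from each state is summarised by a single hypothetical number, so that domains can be compared by size. The state names, the numbers and the act labels are all invented for illustration.

```python
# Hypothetical data: size of the domain of ergodicity that the real system
# reaches a fixed time after the moment z, for each possible state at z.
ergodicity_after = {"x1": 3, "x2": 5, "x3": 5, "x4": 2}


def weakly_below(a, b):
    """Weak ordering (56): state b leads to at least as large a domain as a."""
    return ergodicity_after[a] <= ergodicity_after[b]


def equivalent(a, b):
    """Equivalence (57)."""
    return weakly_below(a, b) and weakly_below(b, a)


def strictly_below(a, b):
    """Strict preference (58): b is to be preferred to a."""
    return weakly_below(a, b) and not weakly_below(b, a)


def equivalence_classes():
    """Partition of the possible states induced by (57)."""
    classes = []
    for x in sorted(ergodicity_after):
        for cls in classes:
            if equivalent(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


def act_preferred(act_a, act_b):
    """(59): acts are (label, resulting_state) pairs; is act_a preferred?"""
    return strictly_below(act_b[1], act_a[1])


print(equivalence_classes())  # prints [['x1'], ['x2', 'x3'], ['x4']]
print(act_preferred(("reform", "x2"), ("delay", "x4")))  # prints True
```

Two acts that lead to equivalent states, such as x2 and x3 here, are not ordered with respect to one another, which is what makes the ordering weak.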

The procedure just outlined gives a well defined meaning for the
derivation of 'Ought' from 'Is'. Indeed, surely the objective laws of
development must be counted as belonging to 'Is'. It is not less sure
that a statement like 'the act $a_\lambda(x')$ is to be preferred to the act $a_\lambda(x'')$'
involves an 'Ought'. The only point of discussion is whether the principle
of ordering, represented by formulae (55) and (56), should be
called 'derived' from the objective laws of development or whether
it represents an autonomous principle of choice.

The problem is partly verbal. However, there are arguments in favour 
of using the word ‘derivation’ here. What is the content of our principle 
of ordering, as expressed by formulae (55) and (56)? It states that, once
there are objective laws of development, imposing necessary restrictions
on the pursuits of human beings, and a measure (the domain
of ergodicity) for the freedom of man from these necessities, such
states of affairs are to be preferred which advance the development
towards greater freedom. We surely can give different formulations
for the underlying principle of ordering differing from our formula
(55), but they would only reflect somewhat different aspects of the
objective laws of development, and thus be just as well derived
from these laws. Thus I come to the conclusion that those philosophers
who accept that a certain kind of 'Ought' is derivable from
'Is' have assumed a more fruitful stand than their opponents.


6 / The Modal Logic of Social Structures 


The inner structure of social systems cannot be adequately repre- 
sented in terms of the logical device so far discussed in this chapter. To 
be able to characterize properties of cybernetic couplings and elements 
in terms of a logical language we must move from the logic of sentences
(like $L_0$, L, and ML) to the logic of predicates. We shall now
briefly discuss the semantic foundation of such a modal logic of social
structures as based on the existence of the objective laws of human
development.

Considering the developmental model S, described in §2.1, as a
cybernetic whole, its structural organization is represented by the couple
(C, T) (cf. p. 121). We shall here call (C, T) the inner structure or, in
short, the structure of the world S. Given (C, T), the inner law of motion
of mankind $\varphi$ is represented by $\varphi = C \circ T$ (cf. p. 208).

Let 


(60) $\{(C_\sigma, T_\sigma);\ \sigma \in \Sigma\} = \mathfrak{S}$


be the set of all combinations (C, T) compatible with a given total
number m of input channels and n output channels. Then the set (60)
represents the set of all thinkable structures of the world in an analysis
where m input and n output channels are distinguished from one another.

Not all of the thinkable structures $(C_\sigma, T_\sigma)$ are possible, i.e. realizable
in view of the objective laws of development. Possible are only those
structures $(C_\sigma, T_\sigma)$ which give the true inner law of motion $\varphi$: there
must be $\varphi = C_\sigma \circ T_\sigma$. Let the latter set be denoted as follows:


(61) $\{(C_\sigma, T_\sigma);\ \sigma \in \Sigma_\varphi\} = \mathfrak{S}_\varphi$.


This is thus the set composed of all the possible structures of the real 
world. 

Every structure $(C_\sigma, T_\sigma)$ can be produced, in the general case, by the
mutual coupling of different numbers of different elementary systems E.
Let the system $S_\sigma$ having the structure $(C_\sigma, T_\sigma)$ be composed of the
following set of elements:


(62) $I_\sigma = \{E_1^\sigma, E_2^\sigma, \ldots, E_N^\sigma\}$.

Every element $E_i^\sigma$ is characterized by a given function

(63) $T_i^\sigma: X_i^\sigma \to Y_i^\sigma$, $X_i^\sigma \subset R^{m_i}$, $Y_i^\sigma \subset R^{n_i}$,


which expresses the way of causal functioning of the element in question
(cf. equation (6) on p. 146). There is $\sum_i m_i = m$, $\sum_i n_i = n$.


We define the field of individuals I of the (first-order) logic $PL_1$ of
predicates of social structures as the class of all the elements E which
can be used to produce any of the thinkable social structures:


(64) $I = \{E;\ E \in I_\sigma,\ \sigma \in \Sigma\}$.




Thus every predicate p discussed in the language $PL_1$ is a characteristic
of a subset $I_p$ of a k(p)-fold cartesian product of I with itself: $I_p \subset I^{k(p)}$.
Two predicates p and q are semantically equivalent if, and only if $I_p = I_q$.
The logical connectives between given p and q are associated with
the corresponding set operations between the sets $I_p$ and $I_q$.

The one-member predicates then give the descriptions of any thinkable 
properties of the individual elements. The two-member predicates give 
the descriptions of any thinkable relations between any two elements, etc. 

Let the set of all the elements E which appear in the construction of
some or all of the systems $S_\sigma$ having a structure $(C_\sigma, T_\sigma) \in \mathfrak{S}_\varphi$ be


(65) $I_\varphi = \{E;\ E \in I_\sigma,\ \sigma \in \Sigma_\varphi\}$.


Then a one-member predicate p is necessary if, and only if $I_\varphi \subset I_p$.
It is impossible if, and only if $I_p \subset I - I_\varphi$. It is possible if, and only if
$I_p \cap I_\varphi \neq \emptyset$, and contingent if, and only if not $I_\varphi \subset I_p$. In a similar way,
a two-member predicate p is necessary if, and only if $I_\varphi \times I_\varphi \subset I_p$,
impossible if, and only if $I_p \subset I \times I - I_\varphi \times I_\varphi$, possible if, and only if
$I_p \cap (I_\varphi \times I_\varphi) \neq \emptyset$, and contingent if, and only if not $I_\varphi \times I_\varphi \subset I_p$.
Continuing in this way the modal categories N, U, M, and M' can be defined
for all the predicates discussed in $PL_1$.
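The set-theoretic conditions for one-member and two-member predicates follow one pattern, which a short sketch can make explicit. The element names, the choice of which elements occur in possible structures, and the example predicates are all hypothetical; a predicate is represented by the set of tuples of elements that satisfy it.

```python
from itertools import product

I = frozenset({"E1", "E2", "E3", "E4"})  # hypothetical field of individuals
I_phi = frozenset({"E1", "E2"})          # elements occurring in possible structures


def modal_values(I_p, arity=1):
    """Modal categories of a predicate given as a set of `arity`-tuples."""
    universe = set(product(I, repeat=arity))
    allowed = set(product(I_phi, repeat=arity))
    values = set()
    if allowed <= I_p:             # necessary
        values.add("N")
    if I_p <= universe - allowed:  # impossible
        values.add("U")
    if I_p & allowed:              # possible
        values.add("M")
    if not allowed <= I_p:         # contingent
        values.add("M'")
    return values


p = {("E1",), ("E2",), ("E3",)}   # a property of single elements
q = {("E3",)}                     # a property found only outside I_phi
r = {("E1", "E2")}                # a relation between two elements

print(sorted(modal_values(p)))           # prints ['M', 'N']
print(sorted(modal_values(q)))           # prints ["M'", 'U']
print(sorted(modal_values(r, arity=2)))  # prints ['M', "M'"]
```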

Every system $S_\sigma$ having the structure $(C_\sigma, T_\sigma)$ such that $\sigma \in \Sigma_\varphi$ defines
a 'possible world'. Let $S_\sigma$ be the real world. Then the set


(66) $\{S_\sigma;\ \sigma \in \Sigma_\varphi\} = SM(PL_1)$

is the set of all the semantic models of our language $PL_1$. The set

(67) $I_\sigma = \{E_1^\sigma, E_2^\sigma, \ldots, E_N^\sigma\}$


is the set of the true elements of the world corresponding to the semantic
model $S_\sigma$. A one-member predicate p is true in the world $S_\sigma$ if, and only
if $I_\sigma \subset I_p$, and otherwise false. A two-member predicate p is true in
the world $S_\sigma$ if, and only if $I_\sigma \times I_\sigma \subset I_p$, otherwise false. In this way
the semantic values W and F can be associated, for each possible world
$S_\sigma$ separately, with the predicates discussed in $PL_1$.

Given a semantic model $S_\sigma$ of our language $PL_1$ the existence of
an individual E having the one-member predicate p means that $I_\sigma \cap I_p
\neq \emptyset$. The existence of a sequence $(E_1, E_2)$ of individuals having the
two-member predicate p means that $(I_\sigma \times I_\sigma) \cap I_p \neq \emptyset$, etc. In this
way the quantifiers and the truth values of sentences including quantifiers
can be defined in $PL_1$.




The definition of the semantic values N, U, M, and M' for sentences
containing quantifiers is obvious. For instance, if the sentence Ex(E)p(E)
is true for all the possible worlds $S_\sigma$, $\sigma \in \Sigma_\varphi$, i.e. for all the structures
of the world allowed by the laws of development, then it is a necessary
characteristic of the real world. If the sentence Ex(E)p(E) is true for
at least one of the possible worlds, it expresses a possible characteristic
of the real world, etc.

The above hints suffice to show how our cybernetic logic of develop- 
ment is to be extended for a logico-mathematical discussion of social 
structures. Such a discussion, however, and social science in general 
is outside of the subject of this book. 

Obviously, a two-level theory of social knowledge is suggested.
The empirical level, concerning the historical process x(t), and
the theoretical level, concerning the social structures (C, T), are
connected with one another by the law of development $\varphi$.


Index of Names 


Arbib, M. 195
Aristotle 211
Ashby, W. 88, 145
Church, A. 194
Descartes, R. 83
Engels, F. 121, 200
Ficker, L. von 5
Franklin, B. 186
Greniewski, H. 89, 109
Gödel, K. 17, 194, 215
Hermes, H. 185
Hintikka, J. 193-194, 216
Hume, D. 5
Kant, I. 5
Klaus, G. 177, 194-195
Kolmogorov, A. 86, 193
Lange, O. 86-87, 94, 96, 159, 167-168, 202
Lewis, C. 215-217
Łukasiewicz, J. 217
Mally, E. 216-217
Markov, A. 194
Marx, K. 146
McCulloch, W. 85-86, 190, 192
Menger, K. 217
Mesarovic, M. 103
Nagel, E. 194-195
Neumann, J. von 85-87, 169-172, 192, 195
Newman, J. 194-195
Pavlov, I. 84-85
Pitts, W. 86, 190, 192
Planck, M. 131
Rosen, R. 170
Shannon, C. 132
Turing, M. 85, 177, 179, 181-183, 185-187, 189, 192, 194, 196
Weaver, W. 132
Wiener, N. 85
Wittgenstein, L. 5, 7-8
Wright, G. von 202, 216


